Chapter 2 Methods and Statistics in I-O Psychology


MODULE 2.4 SUMMARY 


To interpret data, it is important to consider reliability,
which is the extent to which the measures
are consistent over time, in different equivalent
forms, and from one rater to another.

There are several traditional designs for demonstrating
validity, including content, criterion-related,
and construct. All designs are intended
to answer the question of whether better performance
on the test or predictor is associated
with better performance on the job.


The question of validity is significant in many
court decisions affecting workplace issues. The
advantage is that this gives recognition to
the value of I-O research. The drawback is that
the courts and legislatures have narrowed the
range of research designs that are considered
acceptable, and have established guidelines that
do not keep up with current work in the field.


KEY TERMS 


reliability 

validity 

test-retest reliability 
equivalent forms reliability 
internal consistency 
generalizability theory 


predictor 

criterion 

criterion-related validity 
validity coefficient 
predictive validity design 
concurrent validity design 


content-related validation design

construct validity 
construct 



Industrial Psychology 




Individual Differences and Assessment
Job Analysis and Performance
Performance Measurement
Staffing Decisions 
Training and Development 












Individual Differences and Assessment 



Module 3.1 An Introduction to Individual Differences
Some Background
Differential Psychology, Psychometrics, and I-O Psychology
Identifying Individual Differences
Varieties of Individual Differences

Module 3.2 Human Attributes
Abilities
Cognitive Abilities
Intelligence as “g”
Is “g” Important at Work?
Is “g” as Important in Other Countries as It Is in the United States?
Can Your Level of “g” Change?
The Issue of Retesting and Cognitive Ability
Specific Cognitive Abilities beyond “g”
Physical, Sensory, and Psychomotor Abilities
Physical Abilities
Sensory Abilities
Psychomotor Abilities
Personality and Work Behavior
The Big Five and Other Models of Personality
Case Study 3.1
Implications of Broad Personality Models
Personality Change over the Life Span
Additional Attributes
Emotional Intelligence

Module 3.3 Foundations of Assessment
The Past and the Present of Testing
What Is a Test?
What Is the Meaning of a Test Score?
Test Users and Test Interpretation
What Is a Test Battery?
Where to Find Tests
Administrative Test Categories
Speed versus Power Tests
Group versus Individual Tests
Paper and Pencil versus Performance Tests
Testing and Culture
International Assessment Practices

Module 3.4 Assessment Procedures
Assessment Content versus Process
Assessment Procedures: Content
Cognitive Ability Tests
Knowledge Tests
Tests of Physical Abilities
Psychomotor Abilities
Personality
Practical Issues Associated with Personality Measures
Integrity Testing
Emotional Intelligence
Individual Assessment
Interviews
Interview Content
Interview Process
Assessment Centers
Work Samples and Situational Tests
Work Sample Tests
Situational Judgment Tests

Module 3.5 Special Topics in Assessment
Incremental Validity
Biographical Data
Grades and Letters of Recommendation
Minimum Qualifications
Useless Assessment Practices: Graphology and the Polygraph
Drug and Alcohol Testing
Computer-Based and Internet Assessment
Unproctored Internet Testing
Who Is a Candidate?
Computer Adaptive Testing
Testing and Demographic Differences








MODULE 3.1 


An Introduction to Individual Differences 


What do Amy Winehouse, the Pope, Yo-Yo Ma, Stephen King, Carmelo Anthony,
Barack Obama, your grandmother, and your instructor have in common? Not much. They
are different in abilities, interests, experiences, personality, age, gender, race, and backgrounds.
Indeed, the only thing we can say with certainty about these individuals is that
they are substantially different from one another. We would not expect your grandmother 
to try out for an NBA team, or Stephen King to officiate at a religious service, or your 
instructor to meet with foreign heads of state. Many psychologists, including I-O psychologists, 
believe that the differences among individuals can be used, at least in part, to understand 
and predict their behavior. 

But it isn’t good enough to say simply that people are different. You don’t have to be 
a psychologist to recognize that. Some types of differences prove more useful than others 
in predicting and understanding behavior. The differences among people on various 
attributes like intelligence, personality, and knowledge are important in understanding a 
wide variety of socially important outcomes (Lubinski, 2000), including: 

• academic achievement; 

• intellectual development; 

• crime and delinquency; 

• vocational choice; 

• income and poverty;

• occupational performance. 

This chapter will deal first with the concept of individual differences, and then with how
the assessment of these differences can help to predict occupational performance. 

Some Background 


Psychology began in a laboratory in Germany in 1876. The father of the discipline, Wilhelm
Wundt, was anxious to show that psychology was different from philosophy and medicine. 
Since this was a new science and the existing physical sciences like chemistry, biology, 
and physics had discovered many general principles that enhanced their importance, Wundt 
set out to uncover general principles of human behavior as well. He developed techniques
for studying the sensations and reactions of people, examining the dimmest light that individuals
could see, the faintest sound they could hear, and how quickly they could react to
a signal. But those who assisted in conducting his experiments quickly discovered that not
everyone had the same reaction time, or could see the same dim light, or hear the same
soft tone. In other words, they discovered that there were differences among individuals.
These differences detracted from the precise results Wundt sought, but to one of his
assistants they represented a fascinating discovery. James McKeen Cattell (1860-1944),
an American who received a PhD in psychology under Wundt’s direction, soon began
measuring and charting the differences among people using “psychological” variables. In
1890, Cattell developed the concept of a mental test as a way of charting these differences.
Since the subject matter of this research was differences, the study of differences became
known as differential psychology (Landy, 1997).

After leaving Wundt’s laboratory at the University of Leipzig, Cattell went to England
and worked with another researcher very interested in individual differences, Francis Galton.
Galton was gathering information that would support his cousin Charles Darwin’s radical
theory of evolution. In earlier years, Galton had measured inherited characteristics like
height, weight, reach, and hair color. With his new mental test, Cattell was able to expand
the number of inherited characteristics that he could examine. After working with Galton
for several years in developing a comprehensive mental test, Cattell returned to America
and used this test to measure the intelligence of incoming college students. He believed
that he could use the resulting scores to help students choose curricula and to predict who
would successfully complete college. Cattell had developed methods of measuring mental
ability, placing it on a scale or metric. As a result, the actual measurement of abilities became
known as psychometrics.

While Freud and other early psychologists began to focus on pathological aspects of
mental function, the pioneers of differential psychology were primarily interested in the
mental abilities of “normal” people. Several were aware of Cattell’s work in measuring intelligence.
In France, Alfred Binet was measuring mental abilities of French school children.
Lewis Terman was conducting similar studies in California with a translation of Binet’s
test. Hugo Münsterberg was measuring the abilities of trolley drivers in order to predict
the likelihood of accidents. When the United States entered the First World War in 1917,
the leading industrial psychologists of the time persuaded the Army to use an intelligence
test to screen recruits and determine who should attend officers’ candidate school. Two
years after the war’s end, Walter Dill Scott, one of the founding fathers of I-O psychology,
proclaimed that “possibly the single greatest achievement of the American Psychological
Association is the establishment of individual differences” (Lubinski, 2000).

In the postwar years, intelligence tests were adapted for use in selecting individuals for
jobs with government and industry. By 1932 measuring the differences in intelligence among
individuals in order to predict things like accidents and productivity was a well-established
practice (Landy, 1997; Viteles, 1932). As we will see later in the chapter, intelligence is still
one of the most generally assessed characteristics of job applicants. As we shall also see,
the hunt for new “abilities” continues even in the 21st century. We will consider two such
abilities (emotional intelligence and situational judgment) in this chapter.


Differential Psychology, Psychometrics, and
I-O Psychology






Nearly a century later, measuring the differences among individuals to predict later
behavior (“psychometrics”) remains one of the most common frameworks applied by
I-O psychologists. It is different from the framework used by an experimental psychologist.


Mental test Instrument
designed to measure a
subject’s ability to reason,
plan, and solve problems;
an intelligence test.

Differential psychology
Scientific study of
differences between or
among two or more people.

Intelligence The ability to
learn and adapt to an
environment; often used to
refer to general intellectual
capacity, as opposed to
cognitive ability or mental
ability, which often refer to
more specific abilities such
as memory or reasoning.

Mental ability Capacity to
reason, plan, and solve
problems; cognitive ability.

Metric Standard of
measurement; a scale.

Psychometrics Practice of
measuring a characteristic
such as mental ability,
placing it on a scale or
metric.

Intelligence test Instrument
designed to measure the
ability to reason, learn, and
solve problems.














Chapter 3 Individual Differences and Assessment 


Psychometrician 
Psychologist trained in 
measuring characteristics 
such as mental ability. 

Cognitive ability Capacity to 
reason, plan, and solve 
problems; mental ability. 

“g” Abbreviation for 
general mental ability. 

General mental ability The
nonspecific capacity to
reason, learn, and solve
problems in any of a wide
variety of ways and
circumstances.


The experimental psychologist usually designs an experiment that will show how all people 
are alike in their response to a stimulus, and looks outside the individual to the stimulus 
as a way to explain behavior. In contrast, the differential psychologist is person-centered, 
looking for qualities or characteristics within the person that will help us understand that 
person’s behavior. In the past, I-O psychology, particularly the applied aspect of it,
depended on these differences to predict things like job success, job satisfaction, and
counterproductive work behavior. I-O psychology still makes great use of the individual
differences approach, but it also considers factors like organizational practices, team
characteristics, physical work and work environment design, and even broad cultural influences.

The marriage of psychometrics and differential psychology was a good one. The differential
psychologist identified what should be measured, and the psychometrician set
about measuring it. As we saw from the work of Cattell and his contemporaries, the attribute
most often measured (and considered most important) was some form of intelligence,
or cognitive ability. People use cognitive abilities to acquire knowledge, solve problems,
and apply reason to situations. Consequently, many studies were conducted to show that
an individual’s general intellectual capacity was closely associated with that individual’s
occupational and vocational success. The pioneers in theories of intelligence referred to this
attribute as “g,” an abbreviation for general mental ability (Hull, 1928; Spearman, 1927).
Today’s psychologists still use that term, and we will use it in this book. We have just introduced
three terms (“intelligence,” “cognitive ability,” and “mental ability”) in rapid succession.
These all refer to general mental ability, and for the present discussion you may
consider them interchangeable. Later in this chapter we will explore the distinction between
these general abilities and specific mental processes such as memory and perception.

Identifying Individual Differences 


As we saw in the earlier section describing the history of individual differences, Francis
Galton was one of the early advocates of studying such differences. In 1890 Galton wrote:
“One of the most important objects of measurement is ... to obtain a general knowledge
... of capacities ... by sinking shafts at a few critical points” (Lubinski, 2000). By this,
Galton meant that we can use psychometric tests to explore individual abilities and other
attributes the way miners use drilling to explore minerals in the earth. That is an excellent
metaphor for the study of individual differences: sinking shafts to obtain more general
knowledge about behavior at work. This concept also provides a good framework
for explaining how I-O psychologists explore individual differences today as opposed to
25 years ago. In the past, we concentrated on only one shaft: intelligence. Today we are
sinking many more shafts, as well as deeper ones (e.g., specific mental abilities such as
memory, specific aspects of personality such as emotional stability, potentially new abilities
such as emotional intelligence). Before, we were content to stop at a more superficial level
(“intelligence” or “personality”). Today our explorations are broader and deeper, and we
can reach more meaningful conclusions because the reliability and validity of our measuring
devices are better.

We need to keep in mind that not all individual differences will tell us something important.
As in drilling for oil, water, or gold, we don’t always “strike it rich.” This is one of
the reasons we do research: to see which shafts provide encouragement. For example, there
is considerable discussion about whether emotional intelligence is a new shaft producing
results, or merely a shaft that connects to some shafts (e.g., personality and intelligence)
that have already been drilled.

To continue the drilling metaphor, we can distinguish among the differential psychologist,
the psychometrician, and the applied I-O psychologist. The differential psychologist examines
the psychological landscape and identifies areas for drilling. The psychometrician actually
sinks the shaft. The applied I-O psychologist uses what comes out of that shaft: in this
case not oil, water, or gold, but valuable predictors of performance. Later in this chapter,
we will examine the actual assessment methods for examining these individual differences.

However, you must remember (and we will remind you) that behavior is complex
and people are whole. No single area of individual difference (e.g., intelligence) is likely
to completely (or even substantially) explain any important aspect of work behavior. You
cannot separate an individual’s intelligence from his or her personality, knowledge, or experience.
In fact, some environments (e.g., stressful ones) may actually cancel out the effect
of individual variables such as intelligence. When you look at the behavior of any individual,
you need to remember that he or she is a whole, intact entity. To acknowledge a
person’s individuality, we need to go beyond considering just one or another possible
predictor of his or her behavior.

Varieties of Individual Differences 


g-ocentric model Tendency
to understand and predict
the behavior of workers
simply by examining “g.”

Physical abilities Bodily
powers such as muscular
strength, flexibility, and
stamina.

Personality An individual’s
behavioral and emotional
characteristics, generally
found to be stable over time
and in a variety of
circumstances; an
individual’s habitual way of
responding.

Interests Preferences or
likings for broad ranges of
activities.

Knowledge A collection of
specific and interrelated
facts and information about
a particular topical area.

Emotion An effect or
feeling, often experienced
and displayed in reaction to
an event or thought and
accompanied by
physiological changes in
various systems of the
body.


In the past 15 years, there has been a substantial shift in thinking about individual differences.
Instead of simply examining general mental ability (“g”) to understand and predict
the behavior of workers (a tendency that Sternberg and Wagner (1993) called the
g-ocentric model), researchers are broadening the field of examination. In addition to cognitive
ability, work psychologists now consider individual differences in physical abilities,
personality, interests, knowledge, and emotion. This is the result of several forces. In the
early years of testing, the only available tests were intelligence tests. Today there are many
reliable methods for measuring personality, knowledge, interests, and emotional reactions
to work. In addition, our understanding of the many facets of performance has become
more sophisticated. Instead of simply assessing overall performance, like an overall GPA,
we now consider specific facets of performance such as technical task performance,
organizational citizenship, counterproductive work behavior, and adaptability, topics we
will address in Chapters 4 and 5. Murphy (1996) proposes that people have many different
attributes that serve many different job demands (see Figure 3.1).

Let’s apply that view to the job of firefighter, which requires driving
the fire truck to the fire, applying water to the fire, providing medical
assistance, rescuing trapped citizens, and learning new procedures and
how to use new equipment. To accomplish these tasks, firefighters
work in teams. To provide medical assistance and learn new procedures,
the firefighter needs cognitive ability. To rescue trapped citizens and
apply water to the fire, the firefighter needs physical ability, courage,
and problem-solving skills that come from experience in fighting fires.
To accomplish teamwork and to deal with victims, the firefighter needs
communication skills. To drive the truck to the fire accurately and safely,
the firefighter needs good vision, coordination, and the knowledge or
memory of how to get to the location. If we only bothered to consider
“g,” we would only be able to understand and predict a limited portion
of the firefighter’s job performance. To understand the full range of performance,
we need to consider attributes beyond “g.” Box 3.1 lists some
individual difference characteristics that were identified a century ago.

Figure 3.1 The Link between Attributes and Behavior in Organizations

Tesluk and Jacobs (1998) expanded on a model proposed by
Quiñones and colleagues (1995) and suggested ways of combining the
measures suggested by the latter (amount, time, and type)
to get a more complete index of experience. They also suggested that experience
directly affects work knowledge and skills, motivation, values, and attitudes, as well as
indirectly affecting job performance. Much of the emphasis in the Tesluk and Jacobs work










BOX 3.1 EARLY INDIVIDUAL DIFFERENCE CHARACTERISTICS 


James McKeen Cattell began testing incoming 
students, first at the University of Pennsylvania in 
1892, then at Columbia in 1900. He wanted to 
identify the characteristics of “individual differences” 
of the students so that he could eventually predict 
which applicants for college admission were likely 
to get a degree. The following is a list of some of
the information Cattell gathered on each student:

• Memory 

• Reasoning 

• Numerical skills 

• Reaction time 

• Hair color 


• Weight 

• Height 

• Right or left handedness 

Questions: 

1. Which of the characteristics in the list above
are not a part of one of the categories of individual
differences in this module?

2. Which of the characteristics in the list would
be unlikely to be related to college success?

3. Which characteristics in the list do you
think are still routinely gathered in the college
admissions process?


experience model is on shaping experiences to make them of maximal value. We will return 
to the issue of shaping work experience in Chapter 7. 

There is a growing consensus (Guion, 1998; Murphy, 1996) that we can divide the individual
differences useful in understanding work behavior into certain categories, including:

• cognitive ability; 

• physical ability; 

• personality; 

• interests. 

In the next section, we will consider these broad categories of attributes, as well as the 
theories that further define them, with the exception of interests. Although measures of 
interests (and in particular vocational interests) have been around for almost 80 years, 
they have received only passing attention from I-O psychologists for two reasons. The first 
is the belief that vocational interests do not predict job performance. The second is that 
they were often thought to be in the domain of vocational counseling, only useful for 
advising students about careers and occupations. There are reasons to possibly reconsider 
the value of assessing vocational interests, but we prefer to devote space in this book to 
other topics. Therefore, we will address the possible relevance of vocational interests on 
the text website. 

Before moving on to the next section, we need to consider the following fundamental 
assumptions that I-O psychologists make when they apply the individual differences 
model (adapted from Guion, 1998). 

1. Adults have a variety of attributes (e.g., intelligence, personality, interests) and the
levels of these attributes are relatively stable over a reasonable time period (several 
years). 

2. People differ with respect to these attributes (i.e., there are “individual differences”) 
and these differences are associated with job success. 


3. The relative differences between people on these attributes remain even after
training, job experience, or some other intervention. Thus, if individual A has
less of an attribute than individual B before training or job experience, and
if they both receive the same training or experience to increase that attribute,
individual A will still have less of that attribute than individual B after the
training or intervention, even though both may have higher levels of the
attribute after training or experience.

4. Different jobs require different 
attributes. 

5. These attributes can be measured. 



With these assumptions in mind, we can
now examine these attribute categories in the next modules.


MODULE 3.1 SUMMARY


The performance of most jobs requires multiple abilities. What are some of the abilities
called for in the job of firefighter?


The individual differences among people on various
attributes like intelligence, personality, and
knowledge are important in understanding a
wide variety of socially important outcomes.

James McKeen Cattell developed the concept of
a mental test. Since the subject matter of this
research was differences, the study of differences
became known as differential psychology. The
actual measurement of abilities became known
as psychometrics.

The differential psychologist is person-centered,
looking for characteristics within the person
that will help explain that person’s behavior.
The differential psychologist identifies what
should be measured, and the psychometrician
measures it.

Early differential psychologists most commonly 
measured intelligence, or cognitive ability. They 
referred to this attribute as “g,” an abbreviation 
for general mental ability. 

In addition to cognitive ability, I-O psychologists
consider individual differences in physical abilities,
personality, interests, knowledge, and emotion.


KEY TERMS


individual differences
mental test

differential psychology
intelligence
mental ability
metric


psychometrics 
intelligence test 
psychometrician 
cognitive ability 
“g” 

general mental ability 


g-ocentric model 

physical abilities 

personality 

interests 

knowledge 

emotion 








MODULE 3.2


Human Attributes



ABILITIES 


Taxonomy An orderly,
scientific system of
classification.

Perceptual-motor abilities
Physical attributes that
combine the senses (e.g.,
seeing, hearing, smell) and
motion (e.g., coordination,
dexterity).

Affect The conscious,
subjective aspect of
emotion.


In the 1950s, Edwin Fleishman began a program of research to determine the most common
mental and physical abilities associated with human performance, including work
performance. The result was a comprehensive list, or taxonomy, of 52 abilities (Fleishman 
& Reilly, 1992), which can be divided into the broad categories of cognitive, physical, and 
perceptual-motor abilities. This taxonomy is more detailed than we need to deal with at 
this point; the interested student can refer to the full taxonomy on the text website. The 
abilities Fleishman identified cover an impressive variety—and they do not cover personality, 
affect, or interest! 

Fleishman’s list of abilities can be used for many different applied purposes. It is an
effective way to analyze the most important abilities in various occupations (Landy,
1989). It can also be used to determine training needs, recruiting needs, and even work
design. In Chapter 4, you will see how Fleishman’s ability list contributed to the development
of a comprehensive expert computer system called O*NET that connects human
abilities with job demands.


Cognitive Abilities 


IQ Abbreviation for
intelligence quotient.

Intelligence quotient
Measure of intelligence
obtained by giving a subject
a standardized “IQ” test. The
score is obtained by
multiplying by 100 the ratio
of the subject’s mental age
to chronological age.
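
The ratio in this definition can be written out as a short calculation. The following is a minimal sketch in Python; the function name `ratio_iq` and the ages used are hypothetical illustrations, not part of any published test, and the sketch reflects only the historical ratio formula described above.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Historical ratio IQ: 100 times mental age divided by chronological age.

    Hypothetical helper for illustration only; the ages passed in below
    are made-up examples, not data from any study.
    """
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100.0 * mental_age / chronological_age


# A child of 8 who performs like a typical 10-year-old:
print(ratio_iq(10, 8))  # 125.0
# A child whose mental age equals his or her chronological age:
print(ratio_iq(8, 8))   # 100.0
```

A score of 100 therefore means only that mental age and chronological age are equal.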


Intelligence as "g" 

As we mentioned in Module 3.1, many people consider the terms intelligence, IQ, cognitive
ability, and mental ability to be synonyms for one another. In general, we agree, but
it is important to understand how some psychologists distinguish among them. IQ is a 
historical term that stood for Intelligence Quotient and refers to the way early intelligence 
test scores were calculated. The term no longer has scientific meaning, although it is still 
often used by the general public. Intelligence can be defined as the ability to learn and 
adapt to an environment. One or another variation of this definition has been used since 
at least 1921 (Sternberg & Kaufmann, 1998). A group of leading I-O psychologists defined
it as follows: “Intelligence is a very general mental capability that, among other things,
involves the ability to reason, plan, solve problems, think abstractly, comprehend complex
ideas, learn quickly, and learn from experience” (Arvey et al., 1995).

It might be easier to think of this very general ability the same way as we think of
someone as “athletic.” We don’t mean that the person is an expert at every sport, just that
the person is coordinated, picks up new sports easily, and is usually better at sports than
someone we might call “unathletic.” There are specific subparts to doing well at a particular
sport like golf or baseball or swimming, but being “athletic” seems to generally capture
many of those subparts. Similarly, when we refer to someone as intelligent, we imply that
he or she would be good at a range of activities that require learning and adaptation.

The abilities required of an emergency dispatcher include verbal comprehension, reaction
time, and problem solving.

Sternberg and Kaufmann (1998) pointed
out that no matter how enduring this definition may be for Western cultures, other cultures
have different views of who is “an intelligent person.” Speed of learning, for example,
is not always emphasized in non-Western cultures. In fact, “other cultures may be
suspicious of work done quickly” (Sternberg & Kaufmann, 1998), and in some cultures,
the word intelligence means “prudence” and “caution.” Nevertheless, for our purposes,
we will accept the meaning generally assigned by Western psychologists. Intelligence is
required whenever people must manipulate information of any type (Murphy, 1996).
Measures of “g” assess reasoning ability, knowledge acquisition, and problem-solving
ability (Lubinski, 2004).

Is “g” Important at Work?

Yes. Almost every job requires some active manipulation of information, and the greater
the amount of information that needs to be manipulated, the more important “g”
becomes. Meta-analyses of the relationship between “g” and job performance (Schmidt
& Hunter, 2004) have demonstrated very clearly that as the complexity of the job
increased, the predictive value (i.e., validity) of tests of general intelligence also increased.
This means that if the information-processing demands of a job are high, a person with
lower “g” is less likely to be successful than a person with higher “g.” That does not mean,
however, that high “g” guarantees success on that job. If the job also requires interpersonal
skills, communication skills, and certain personality traits, even a person with high
“g” (but lower levels of those noncognitive traits) might fail.
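
To make the idea of combining validity coefficients across studies more concrete, the sketch below computes a sample-size-weighted average correlation, which is one common first step in a meta-analysis. It is only an illustration under stated assumptions: the function name `weighted_mean_validity` is hypothetical, the study values are invented for the example (they are not taken from Schmidt and Hunter or any other source), and the corrections that published meta-analyses usually apply are left out.

```python
def weighted_mean_validity(studies):
    """Sample-size-weighted mean of observed validity coefficients.

    `studies` is a list of (sample_size, observed_r) pairs. This is a
    bare-bones illustration; it applies no statistical corrections.
    """
    total_n = sum(n for n, _ in studies)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(n * r for n, r in studies) / total_n


# Invented numbers, for illustration only (not real research results).
low_complexity_jobs = [(100, 0.20), (300, 0.24)]
high_complexity_jobs = [(100, 0.50), (300, 0.54)]

print(f"{weighted_mean_validity(low_complexity_jobs):.2f}")   # 0.23
print(f"{weighted_mean_validity(high_complexity_jobs):.2f}")  # 0.53
```

Weighting by sample size lets larger studies count for more than smaller ones. In this made-up example the average validity is higher for the more complex jobs, which is the pattern the paragraph above describes.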

In 1965 Tanner showed that he could accurately predict which Olympic athletes were
competing in which sports by looking at their body builds. But within each Olympic event,
the same individual differences were useless as predictors of who would get a medal (Lubinski,
2000). In this example, think of body build as “g,” and all the other attributes of the athletes
as specific abilities and attributes; “g” may help a candidate get into the police academy,
but it will not ensure that the person will become a successful police officer.

Some, but far from all, of today’s psychologists continue to believe that nothing more
than measures of “g” are needed to predict training, grades, and job performance. An excellent
review of the debate can be seen in an entire issue of the journal Human Performance


Meta-analysis Statistical
method for combining and
analyzing the results from
many studies to draw a
general conclusion about
relationships among
variables.













Flynn effect Phenomenon
in which new generations
appear to be smarter than
their parents by a gain of
15 points in average
intelligence test score per
generation; named after the
political scientist who did
extensive research on the
topic.

Mean  The arithmetic average of the scores in a distribution; obtained by
summing all of the scores in a distribution and dividing by the sample size.

Standard deviation 
Measure of the extent of 
spread in a set of scores. 


devoted to the topic (Ones & Viswesvaran, 2002). One psychologist framed the issue as 
follows. 

General mental ability (g) is a substantively significant determinant of individual differences
for any job that includes information-processing tasks. ... The exact size of the relationship
will be a function of ... the degree to which the job requires information processing and
verbal cognitive skills. (Campbell, 1990a)

From Campbell’s statement we can infer that because “g” represents information-processing
ability, then it should logically predict information-processing performance in the workplace.
In addition, we can infer that jobs differ in terms of not only how much “information
processing” they require, but also how quickly that processing must be completed.
A backhoe operator or a nutritionist certainly has to process some information, but not 
as much or as quickly as a software help-desk operator or an air traffic controller. 
The backhoe operator will depend much more heavily on visual/spatial ability than on 
problem-solving or reasoning ability. The nutritionist will depend more heavily on acquired 
knowledge of the composition of various foods and the nutritional needs of a client. 

Is “g” as Important in Other Countries as It Is in the United States?

The simple answer seems to be “yes,” at least as far as Europe is concerned. Several
meta-analyses have been published demonstrating the predictive value of “g” in the European
Union (E.U.) (Salgado & Anderson, 2003; Salgado, Anderson, Moscoso, Bertua, & DeFruyt, 
2003; Salgado, Anderson, Moscoso, Bertua, DeFruyt, & Holland, 2003) and specifically in 
the U.K. (Bertua, Anderson, & Salgado, 2005). Salgado and Anderson (2002) also report 
that the use of tests of mental ability is even more prevalent in the E.U. than in the U.S. 

Much less is known about non-European countries. As long as globalization is controlled 
by Western nations, it has been fairly safe to assume that “g” would remain important in 
non-European countries as well. But in the near future, as China emerges as a dominant 
global player, it will be interesting to see how important a role “g” continues to play in 
the global economy. It is possible that with such a massive potential employee population 
in China, success in the global arena may be defined in terms of how many people you 
throw at a project rather than hiring only the most intelligent applicants. In 2008, the Chinese 
workforce was estimated at 776 million, compared to 150 million workers in the U.S. 
By the year 2016, the Chinese workforce will have grown by an additional 221 million
(22 percent) versus a growth of 22 million (.8 percent) in the U.S. A similar comparison can
be made between the U.S. and India. The Indian workforce was approximately 435 million
in 2008 and is projected to grow by 80 million (18 percent) by 2016. We may see a different,
and possibly diminished, role for “g” in both China and India over the next few
decades. Conversely, it may be that, at least on the global stage, the U.S. will need to compete
by working “smarter” rather than through massive numbers of cheap workers,
making “g” even more important than it has been before. 

Can Your Level of “g” Change?

Today’s researchers observe a fascinating phenomenon: Intelligence continues to rise over
time. Individuals appear to be getting smarter and smarter through the life span, and new
generations appear to be smarter than their parents. The phenomenon, labeled the Flynn
effect after a political scientist who has done extensive research on the topic (Flynn, 1999),
amounts to a gain of 15 points in average intelligence test scores per generation. This is
a substantial increase, considering that the mean intelligence on most tests is pegged at
100 with a standard deviation of 15. There are many theories as to why this is occurring.






Proposed explanations include better health care, better nutrition, increased schooling, and
better-educated parents (Sternberg & Kaufman, 1998). Another factor may be the increasingly complex
environment we live in both at work and at home (Neisser et al., 1996). This phenomenon
exists in the U.S. and many other countries (Daley, Whaley, Sigman, Espinosa, &
Neumann, 2003). So, at least at a generational level, the answer seems to be that on the
average, your generation will be “smarter” than your parents’ generation. Within generations,
however, “g” appears to be more stable (Lubinski, Benbow, Webb, & Bleske-Recheck,
2006; Wai, Lubinski, & Benbow, 2005). And as we get older, the intelligence level we possessed
when we were younger may become even more stable (Bouchard, 2004; Plomin &
Spinath, 2004). So there is good news and bad news. The good news is that your generation
is likely smarter than your parents’ generation and that your level of “g” will possibly
increase. The bad news is that as you age, the amount of change will get smaller, so start
working on it now!
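
To see why a 15-point gain per generation is substantial, it helps to express it in standard deviation units. The short Python sketch below does that arithmetic using the figures quoted in the text (mean of 100, standard deviation of 15). It also assumes, purely for illustration, that test scores are normally distributed; that assumption and the variable names are ours, not part of the research cited above.

```python
from statistics import NormalDist

MEAN_IQ = 100.0              # mean on most intelligence tests (from the text)
SD_IQ = 15.0                 # standard deviation (from the text)
GAIN_PER_GENERATION = 15.0   # Flynn effect gain quoted in the text

# Assumption for illustration: scores follow a normal distribution.
older_norms = NormalDist(mu=MEAN_IQ, sigma=SD_IQ)

# Express the generational gain in standard deviation units.
gain_in_sd = GAIN_PER_GENERATION / SD_IQ

# Where would the newer generation's average fall on the older generation's norms?
new_generation_average = MEAN_IQ + GAIN_PER_GENERATION
percentile_on_older_norms = older_norms.cdf(new_generation_average) * 100

print(f"Gain in standard deviation units: {gain_in_sd:.1f}")
print(f"Percentile on older norms: {percentile_on_older_norms:.1f}")
```

Under these assumptions the gain equals one full standard deviation, so an average scorer in the newer generation would sit at roughly the 84th percentile of the previous generation's distribution.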

The Issue of Retesting and Cognitive Ability

It is commonly said that human abilities (e.g., cognitive ability) are relatively enduring,
meaning that they do not change rapidly. But might cognitive ability increase,
remain stable, or decrease across multiple testing occasions? This is a practical issue for
many employers since they screen job applicants on a regular basis and would like to reduce
the costs of testing. This means not continuing to retest the same applicant if his or her
score is unlikely to change. A related issue is that many employers offer an applicant who
does poorly on a test of cognitive ability the opportunity to retake that test.

There seems to be little doubt that retesting results in higher scores, on the average
(Hausknecht, Halpert, Di Paolo, & Gerrard, 2007; Lievens, Buyse, & Sackett, 2005;
Raymond, Neustel, & Anderson, 2007). We will address this issue again in the training
chapter when we cover the concept of test coaching. But the important question is, which
score is the more accurate measure of what is being assessed: the initial or the retest score?
Lievens, Reeve, and Heggestad (2007) demonstrate that the initial test score is the better
indicator of general mental ability because the retest score is heavily influenced by one
aspect of general mental ability: memory. Further, they show that the validity (i.e., the
ability of the test score to predict performance) actually declines when the retest score is
used as the predictor. In practical terms, if an employer seeks to hire applicants who
score high on “g,” then the retest score is not as valuable as the initial score. The further
implication is that the score “increase” from retesting is “hollow” because it does not signal
a real increase in “g,” only a bump in the score due to a specific cognitive ability: memory.


Specific Cognitive Abilities beyond “g”

The majority of today’s psychologists agree that although “g” is important, more specific
and refined cognitive abilities also play a role in performance, with some specific abilities
important for some jobs and other specific abilities important for other jobs. To return
to the example of an “athletic person,” your roommate or partner may be incredibly
(disgustingly?) athletic, but is unlikely to go head-to-head with Tiger Woods or Lorena Ochoa
on the golf course. The pros must have something (e.g., coordination, far visual acuity,
endurance, joint flexibility) that your roommate does not have. These would be specific
physical abilities. This holds true for cognitive abilities as well. There are wide variations
in the accomplishments of “intelligent” people, which points to specific cognitive abilities
beyond general intelligence.

A question then arises: How many specific abilities are there? There is no conclusive
answer to that question, but we can say with great confidence that there is more than one


FIGURE 3.2 Carroll’s Hierarchical Model (labeled levels include broad abilities and specific abilities)
Source: Carroll (1993).


(i.e., more than just “g”). As we mentioned earlier, Fleishman and his colleagues posited
52 abilities, 21 of which are in the cognitive category, but “g” is not one of them.
Fleishman was more concerned with identifying specific abilities than general mental
ability. It is now generally accepted that cognitive ability is best conceptualized as having
multiple layers of abilities.

Cartoon caption: “Don’t give me too much, I’m not good with money.”
Source: © The New Yorker Collection 2008 Peter C. Vey from cartoonbank.com. All rights reserved.

Carroll (1993) proposed that there are three layers, or strata, to intelligence (see
Figure 3.2). The highest layer is “g”; the next layer down consists of seven more specific
abilities: fluid intelligence, crystallized intelligence, memory, visual perception, auditory
perception, information retrieval, and cognitive speed (Murphy, 1996). The lowest and
most specific level includes abilities that are tied to the seven broad abilities in the middle
level. For example, information ordering (one of Fleishman’s proposed abilities) would
be connected to fluid intelligence, and spatial relations would be associated with visual
perception.
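
The three-layer structure just described can be pictured as a simple nested data structure. The Python sketch below is only an illustration of the hierarchy as summarized in the text: the variable and function names are our own, and only the two specific-ability links mentioned in the text are filled in. The other lists are deliberately left empty rather than guessed.

```python
# Three strata of Carroll's (1993) model as summarized in the text:
# "g" at the top, seven broad abilities in the middle, specific abilities below.
# Only the two specific-ability links named in the text are included.
CARROLL_MODEL = {
    "g": {
        "fluid intelligence": ["information ordering"],
        "crystallized intelligence": [],
        "memory": [],
        "visual perception": ["spatial relations"],
        "auditory perception": [],
        "information retrieval": [],
        "cognitive speed": [],
    }
}


def broad_ability_for(specific_ability):
    """Return the broad (middle-layer) ability tied to a specific ability, or None."""
    for broad, specifics in CARROLL_MODEL["g"].items():
        if specific_ability in specifics:
            return broad
    return None


print(len(CARROLL_MODEL["g"]))                 # number of broad abilities
print(broad_ability_for("spatial relations"))  # broad ability for a specific one
```

Reading the structure from the outside in mirrors the model: every specific ability belongs to one broad ability, and every broad ability sits under “g.”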

There are many other theories of specific cognitive abilities (e.g., Ackerman, Beier, &
Boyle, 2002, 2005), but all resemble Carroll’s. The important thing to remember is that
“g” will only get you so far in understanding work behavior. It is fair to say that a person
with a high level of “g” will probably succeed at certain tasks of almost every job, particularly
complex jobs (Schmidt & Hunter, 1998), but that depending on the job, other attributes
such as personality, emotional reactions, and interests will also play a role in job success.

Recently, some researchers have begun to nominate and explore very specific cognitive
abilities as predictors of work and vocational success. David Lubinski and his colleagues
at Vanderbilt University have been studying intellectually talented adolescents for many
decades, following them from age 13 through adulthood. Although Lubinski agrees that
general mental ability exerts substantial influence on adult vocational accomplishments, he
also finds that the difference between SAT math and verbal scores has implications for
which careers are chosen. The adolescent who had higher scores in math was more likely
to pursue a career in the sciences and technology and more likely to secure a patent (Park,
Lubinski, & Benbow, 2007). In contrast, adolescents with higher verbal scores were drawn







Does reading materials on electronic devices involve different abilities from traditional reading of printed materials?


Have you ever thought about the fact that parents and children read differently? It is not
so much that they read different material (although you would be surprised to see your
parents reading Perez Hilton’s blog on the web!), but that they read using different media.
Many adults still spend a great deal of time reading print material (newspapers, books,
magazines), while their children read web-based material. This has prompted some educators
and testing experts to suggest that some abilities may be unique to Internet reading, and
that the speed and accuracy of this reading ability is not captured by tests of reading
textual material. The result is a lively debate that involves organizations like the National
Council of Teachers of English, the International Reading Association, and the Organization
for Economic Cooperation and Development. Many European countries are beginning to assess
electronic reading in addition to more traditional textual reading. Advocates of the
assessment of electronic reading propose that this 21st-century mode of reading uses
abilities such as searching for answers to questions in real time, navigating the web,
evaluating information from multiple and occasionally conflicting sources, and communicating
through web posts and blogs.

Source: Based on material from Rich (2008).


to the humanities rather than science and were more likely to publish a novel. Although
these gifted adolescents, generally, had high levels of creative accomplishment, the area of
those accomplishments was influenced by the predominance of verbal or mathematical
ability. Lubinski and his colleagues (Webb, Lubinski, & Benbow, 2007) also found that
spatial ability (i.e., the ability to visualize what an object in 3-dimensional space would
look like if it were rotated) added to mathematical ability in predicting career choice and
vocational success in math and science. Even though these studies tell us more about
how careers evolve, they do seem to point to the importance of specific cognitive abilities
to certain vocational domains.








One final specific dimension of cognitive ability—memory—has received attention from 
I-O psychologists. Each computer has a storage capacity, usually indexed as gigabytes. More 
gigs means more memory, and more memory means greater power (both speed and capacity).
König, Bühner, and Mürling (2005) proposed that humans are like computers with
respect to working memory, and that some individuals have more than others. They further
proposed that people with more working memory are better at juggling multiple tasks
simultaneously (known as multi-tasking). Multi-tasking is an important part of many working
lives (e.g., completing a report while answering e-mail messages and phone calls and
questions from the person in the desk next to you). The researchers measured working 
memory by a person’s capacity to keep things (e.g., sentences, positions of objects) fresh 
in his or her mind. Not surprisingly, they discovered that the more working memory a 
person had, the better he or she was at multi-tasking. Thus, there is reason to propose 
including a measure of working memory in the assessment of applicants for jobs that require 
substantial amounts of multi-tasking. 

We simply do not know as much as we need to about the contribution of specific cognitive
abilities to work performance, as general mental ability (or “g” or “intelligence”)
dominated the research and application scene for well over 80 years. Applied psychologists
are only now beginning to re-examine the contribution of specific cognitive abilities
to behavior. We expect that this re-examination will yield substantial insights over the next
10 years. Another interesting sidelight of this research on working memory relates to the
earlier research we reviewed on the effects of retesting on cognitive ability test scores.
It would seem that individuals with large working memories would profit (at least in terms of
their test score) more from retesting than would individuals with smaller working memories.

You might wonder if it is possible to have too much intelligence. This reminds us of a 
story that is often told about a well-known boxer. A radio interviewer was talking with a 
retired middleweight boxer who had fought for many years and had a relatively undistinguished
career, finishing with approximately 60 wins and 30 losses. The interview went
something like this, with the interviewer represented by “I” and the boxer by “B.” 

I: You must have fought many interesting boxers in your career.

B: Yeah, there were plenty of dem.

I: I noticed that you fought so-and-so four times and beat him all four times.

B: Yeah, that surprised me because he had a lot better record than me.

I: Why did it surprise you?

B: Because he was so smart. He was always thinking ahead, what combination he would set
up, where he wanted to be in the ring, and things like that. He was really smart, always
thinking.

I: Then let me ask the obvious question: why do you think you beat him so consistently?

B: I guess it was because when he was thinking, I was punching.

So it does appear that, occasionally, too much “g” can get you hurt! 

Physical, Sensory, and Psychomotor Abilities


Physical Abilities 

Hogan (1991a, b) suggested that seven physical abilities are sufficient for analyzing most
jobs. Guion compared Hogan’s seven abilities with similar abilities identified by Fleishman
and Reilly (1992) and found a close match. As you can see in Figure 3.3, several of Hogan’s
dimensions are combinations of the Fleishman and Reilly (1992) dimensions (e.g., she
combines extent flexibility and dynamic flexibility into a single dimension called
“flexibility”). In a manner reminiscent of Carroll’s theory of intelligence, Hogan then combines
her seven measures to form three higher-order physical abilities: muscular strength,
cardiovascular endurance, and movement quality. For most jobs, this three-ability taxonomy
would likely be sufficient because most physically demanding jobs require muscular
tension, muscular power, and muscular endurance, not just one of the three. Similarly,
flexibility and balance usually go together in a physically demanding job.

FIGURE 3.3 A Model of Physical Abilities

Fairness of Physical Ability Tests 

Because employers often use physical ability tests to screen applicants for physically
demanding jobs, it is important to determine whether such tests are fair to female applicants
and older applicants. Because we lose muscle, stamina, and flexibility as we age, the
older an applicant is, the less well he or she is likely to perform on physical ability tests.
For women the situation has an additional consideration. On average, females have less
muscle mass (which means diminished muscular strength) and lower levels of cardiovascular
endurance (or stamina) than men (Hogan, 1991a). In contrast, on measures of flexibility
(e.g., sit and reach tests), women tend to do better than men. However, most physically
demanding jobs require (or are perceived by employers to require) more muscular
strength and stamina than flexibility. This has meant that male candidates, who tend to
excel on those physical tests, are predominantly hired for such jobs. As a result, women
candidates for popular positions such as firefighter have filed employment discrimination
suits (Brunet v. City of Columbus, 1995).

Women and men of all ages can increase their individual physical abilities with exercise
and training. In addition, many jobs require a fixed level of strength and endurance,
beyond which more is not always better. If your job requires you to lift 25-pound boxes,
the fact that you are strong enough to move 100-pound boxes is irrelevant. In this case,
more strength would not lead to higher performance. Thus, individuals do not always have
to compete against each other on physical ability tests; they merely need to demonstrate
sufficient strength and endurance to perform the job tasks. By training for several months
prior to taking physical ability tests, women candidates can improve their performance
significantly. Thus, one way of helping women to do better on these tests is for employers
to encourage them to train ahead of time (McArdle, Katch, & Katch, 2001). We can predict
that this same strategy may help older job seekers as well.
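
The idea of a fixed requirement, as opposed to a competition among candidates, can be shown with a few lines of Python. This is a minimal sketch only: the 25-pound requirement comes from the example above, while the candidate names and their lifting capacities are hypothetical values invented for illustration.

```python
REQUIRED_LIFT_POUNDS = 25  # job requirement from the example in the text


def meets_requirement(max_lift_pounds, required=REQUIRED_LIFT_POUNDS):
    """A candidate qualifies by meeting the requirement; extra strength adds nothing."""
    return max_lift_pounds >= required


# Hypothetical candidates and maximum lifts (pounds), invented for illustration only.
candidates = {"Candidate A": 100, "Candidate B": 40, "Candidate C": 20}

# Screening against the fixed requirement: candidates are not ranked against each other.
qualified = [name for name, lift in candidates.items() if meets_requirement(lift)]

print(qualified)
```

Here the candidate who can lift 100 pounds and the one who can lift 40 pounds are treated identically, because both clear the 25-pound requirement. Ranking candidates from strongest to weakest would instead reward strength that the job never uses.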




Muscular tension  Physical quality of muscular strength.

Muscular power  Physical ability to lift, pull, push, or otherwise move an object;
unlike endurance, this is a one-time maximum effort.

Muscular endurance  Physical ability to continue to use a single muscle or
muscle group repeatedly over a period of time.

Stamina  Physical ability to supply muscles with oxygenated blood through
the cardiovascular system; also known as cardiovascular strength
or aerobic strength or endurance.










Employers are usually eager to contain the cost of medical and disability programs for
workers, if possible by predicting who is likely to experience an injury and rejecting those
applicants. Physical ability tests have been used as the predictor for future injury. The problem
is that while they may be good (but far from perfect) predictors of future injury, such
tests may not be particularly relevant for present or future job performance. In a case against
Armour Star Meat Packing facility in Iowa, 52 women successfully sued the company for
denying them jobs based on a strength test. A federal judge awarded $3.3 million to the
women because the test was used to predict injuries, not performance on the job
(Business and Legal Reports, 2005a).

Sensory Abilities 

Sensory abilities are the physical functions of vision, hearing, touch, taste, smell, and
kinesthetic feedback (e.g., noticing changes in body position). Hogan includes kinesthetic
feedback in a dimension she calls “movement quality.” The sensory abilities of vision and hearing
are particularly interesting for applied I-O psychologists because employers often test these 
abilities in would-be employees. 

To prevent employers from using a disability as an excuse to reject an applicant who is 
capable of performing a job, the Americans with Disabilities Act of 1990 forbids them 
from asking about or testing areas such as sensory or physical abilities that may be
considered “disabilities” until after they have made a job offer to the candidate.

Until recently, cognitive psychologists considered sensory abilities to be independent of
cognitive abilities, but Carroll’s (1993) model of intelligence calls that assumption into
question; remember that two of his mid-level abilities are visual perception and auditory
perception. In addition, Ackerman’s research (e.g., Ackerman et al., 2002, 2005) shows
the close association between perceptual speed and other measures of cognitive ability.
But in most real-life settings, sensation and perception are inextricably bound together.
We usually infer from some kind of report (verbal or behavioral) that a person has sensed
something. Further research will shed light on the extent to which “noncognitive” abilities
are really “noncognitive.”

Psychomotor Abilities 

Psychomotor abilities, sometimes called sensorimotor, or just motor abilities, deal with
issues of coordination, dexterity, and reaction time. Once again, Fleishman (Fleishman &
Reilly, 1992) has done the most extensive work in identifying these abilities. We can easily
name some jobs for which they may be important (e.g., crane operators, organists, watch
repair technicians, surgeons, wait staff, and bartenders). From this discussion it should be
clear that many psychomotor abilities (e.g., rate control and aiming) may very well be
associated with visual and/or auditory perception or cognitive speed, facets of Carroll’s
theory of intelligence. See Box 3.3 for a discussion of reaction time in work situations.

Psychomotor abilities  Physical functions of movement, associated with
coordination, dexterity, and reaction time; also called motor or sensorimotor abilities.

The work of researchers like Carroll and Ackerman blurs the classical distinctions between
cognitive and “noncognitive” abilities. In some senses, this is a good development, for it
is clear in real life (and, more importantly for us, in work) that all of these abilities interact
within a single person to produce a response or action.

Personality and Work Behavior 


Personality is really a big deal in I-O psychology, probably the biggest deal since the
consideration of the role of intelligence in work behavior about 100 years ago. There is now


Sensory abilities  Physical functions of vision, hearing, touch, taste, smell, and
kinesthetic feedback (e.g., noticing changes in body position).

Americans with Disabilities Act  Federal legislation enacted in 1990 requiring
employers to give applicants and employees with disabilities the same
consideration as other applicants and employees, and to make certain
adaptations in the work environment to accommodate disabilities.




It is interesting to note that some specific mental abilities show remarkable stability
well into the life span. As an example, reaction times change very little between age 30
and age 65, “slowing” by perhaps 1/20 of a second from .45 seconds to .50 seconds.
Technically it is true that reaction time “diminishes” with age, but what are the practical
consequences for most jobs of a diminution of .05 seconds? Nevertheless, the belief that
older workers are “slower” may influence staffing decisions. As an example, the Vermont
State Police requires troopers to retire at age 55 because of the fear that this
“diminution” of reaction time might lead to dangerous behaviors (e.g., drawing and firing a
weapon “too slowly”). Not only is that nonsense from a physiological standpoint, but with
the advent of the semi-automatic handheld weapons issued by most police departments,
more problems have arisen because officers fire their weapons too rapidly, not because of
lags in reaction times. Two famous examples occurred in New York City when civilians
Amadou Diallo (Cooper, 1999) and Sean Bell (Buckley & Rashbaum, 2006) were each killed
in a hail of bullets by police officers who reacted quickly. In retrospect, slower rather
than faster reaction time might have been in order in each case.

Source: Cooper (1999); Buckley & Rashbaum (2006).


a broad consensus that personality predicts not only general behavior and happiness (Steele,
Schmidt, & Shultz, 2008), but more specifically work-related behavior. This work-related
behavior includes performance, absenteeism, counterproductive work behavior, and team
effectiveness. Barrick and Mount (2005) provide the following seven reasons why
“personality matters in the workplace” and document those reasons using recent research:

1. Managers care about personality. In hiring decisions, they weigh personality
characteristics as heavily as they do intelligence.

2. A wide array of empirical research studies show the importance of various personality
measures in predicting both overall job performance and specific aspects of
performance. Personality addresses the “will do” aspects of performance while
intelligence addresses the “can do” aspects of performance.

3. When we combine personality characteristics (rather than considering them one at
a time), the relationship between personality and work performance becomes even
stronger.

4. Personality measures improve the predictability of work performance over what would
be found using only measures of intelligence or experience.

5. There are much smaller differences among age, gender, and ethnic minority
subgroups on measures of personality than on measures of intelligence.

6. Personality measures predict not only near term behavior (e.g., current job
performance) but also distant outcomes such as career success, job and life
satisfaction, and occupational status.

7. Personality measures predict a wide variety of outcomes that are important to
managers, including counterproductive behavior, turnover, absenteeism, tardiness,
group success, organizational citizenship behavior, job satisfaction, task performance,
and leadership effectiveness.

In the last few years, three major journals (Human Performance, 2005; International Journal
of Selection and Assessment, 2007; Personnel Psychology, 2007) have addressed the role of









personality in understanding work behavior. The interested reader can use these sources 
to examine the issues and promise of personality in understanding work behavior. 
Although these treatments highlight some disagreements among researchers (e.g., Hogan,
2005; Morgeson et al., 2007a, 2007b; Roberts, Chernyshenko, Stark, & Goldberg, 2005; 
Tett & Christiansen, 2007) regarding how many personality factors there are, how to assess 
them, and how predictive they are of work performance, the general conclusion confirms 
our position: that personality is a big deal in understanding work behavior. 

The Big Five and Other Models of Personality 


Big 5  A taxonomy of five personality factors; the Five Factor Model (FFM).

Five Factor Model (FFM)  A taxonomy of five personality factors, composed of
conscientiousness, extraversion, agreeableness, emotional stability, and
openness to experience.


Historically, the interest in personality measurement (both in general human behavior and
more specifically in work behavior) began with the development of a taxonomy of personality
dimensions labeled the Big 5 or the Five Factor Model (FFM) (Digman, 1990;
McCrae & Costa, 1985, 1987). According to this model, an individual’s personality can
be described by where that individual falls on five dimensions: openness to experience,
conscientiousness, extraversion, agreeableness, and neuroticism (the opposite of emotional
stability); a useful mnemonic acronym for the five scales is “OCEAN.” The FFM was the
result of both statistical analyses of personality test information gathered over many decades
and a careful conceptual analysis of what most personality tests were trying to assess. The
FFM is a good way to gain a broad understanding of the structure of personality, but it
may be a bit too general for dealing with specific aspects of work behavior. In fact, many
work-related personality models have developed from the more generic FFM and seem to
have more relevance to work behavior. These include the Five Factor Model Questionnaire
(FFMQ; Gill & Hodgkinson, 2007), the Hogan Personality Inventory (Hogan, Davies, &
Hogan, 2007), and the Personal Characteristics Inventory (Mount & Barrick, 2002).


CASE STUDY 3.1 A LEVEL PLAYING FIELD 


It is common to test for physical abilities before choosing candidates for recruit
positions in fire academies. Although physical abilities will be improved in the 16 weeks
of the academy training program, you still require a minimum amount of ability to profit
from the training. Most fire departments administer physical ability tests that simulate
actual tasks performed by firefighters. As examples, candidates may be asked to carry heavy
hose bundles up stairs, open fire hydrants with wrenches, or hang heavy exhaust fans in
windows. Two tests, in particular, seem to be harder for female applicants than their male
counterparts. The first is the “dummy drag” simulation. In this test, the candidate is
asked to drag a 150-pound dummy through a 40-foot maze with several left and right turns
in it. The second task is pulling 50 feet of a simulated fire hose through a 50-foot maze
with two right turns. Since men tend to be larger and stronger, they simply pick up the
dummy and carry it through the maze, while women are more likely to drag the dummy along
the floor of the maze. Similarly, for the hose pull, men tend to simply loop the hose over
their shoulder and pull it through the maze in one single movement. The test is not
exactly the same as the actual task, however; in an actual fire situation the firefighter
is usually pulling a person or a hose through a burning room and must stay close to the
ground because the toxic fumes, smoke, and temperature (often as high as 2,000
degrees) are more deadly in the upper part of a room.

If you wanted to make these test components 
more realistic, how would you redesign the test 
course? If you did redesign it, do you think that the 
performance of women would improve? Why or 
why not? 


TABLE 3.1 The Five Factor Model

FACTOR                   CHARACTERISTICS
Conscientiousness        Responsible, prudent, self-control, persistent, planful, achievement oriented
Extraversion             Sociable, assertive, talkative, ambitious, energetic
Agreeableness            Good-natured, cooperative, trusting, likable, friendly
Emotional stability      Secure, calm, low anxiety, low emotionality
Openness to experience

Source: Based on Digman (1990); McCrae & Costa (1985, 1987).


Virtually all modern personality models resemble the Five Factor Model (FFM) in that 
they propose that we can describe someone’s “personality” by looking at some small number 
of relatively independent factors. Personality can be defined in simplest terms as the 
typical way that an individual has of responding. It is considered a collection of traits because 
it is fairly stable, even though situations and circumstances might lead a person to behave 
in a way that is out of character with his or her overall personality. Using the FFM as an 
example, the model identifies five different components that, when taken together, give a 
fair representation of how a person typically responds to events and people (see Table 3.1). 
Considerable evidence suggests that although the five factors might express themselves in 
slightly different ways in various cultures, the FFM seems applicable cross-culturally 
(Cheung, 2004; McCrae, Terracciano et al., 2005; Tsaousis & Nikolaou, 2001) and that 
culture and personality may be linked (Hofstede & McCrae, 2004). 

It is important to keep in mind that personality factors are intended to measure 
normal personality, not to identify any evidence of psychopathology. We will make that 
distinction clearer later in this chapter when we discuss how personality is measured. 
Nevertheless, some recent research suggests that even the FFM can be used to explore the 
“dark side” of personality: not quite pathological, but definitely undesirable behavior (Kaiser 
& Hogan, 2006). Of the five FFM factors, the first to attract attention from I-O psychologists 
was conscientiousness. More recently, extraversion, openness to experience, and agreeableness 
are also attracting increased attention (Barrick & Mount, 2005). In some early 
research, Barrick and Mount (1991) proposed, on the basis of a meta-analysis, that conscientiousness 
was likely positively related to success in all aspects of work for all occupations. 
That was a strong statement, but it was supported by their analyses. Naturally, 
there were disagreements with the five factor taxonomy and with the presumed overarching 
importance of conscientiousness. The first disagreement was that five factors are too 
few to capture the full range of aspects of personality (Hough, 1992; Tellegen, 1993; Tellegen, 
Grove, & Waller, 1991; Tellegen & Waller, 2000). The second criticism was that although 
conscientiousness might be correlated with a wide range of work behaviors, it was not 
highly correlated with them. In addition, extraversion often correlated as highly with 
behavior as did conscientiousness. A third criticism was that there were combinations of 
the five factors that led to greater predictive power than any one of the factors by itself 
(Dunn, 1993; Hogan & Hogan, 1989; Ones, Viswesvaran, & Schmidt, 1993). The first and 
third criticisms present an interesting dilemma, since one argues for more factors, whereas 
the other seems to be arguing for fewer factors. 



3.2 Human Attributes 


Conscientiousness Quality 
of having positive intentions 
and carrying them out with 
care. 












Chapter 3 Individual Differences and Assessment 


Functional personality at 
work The way that an 
individual behaves, handles 
emotions, and accomplishes 
tasks in a work setting; a 
combination of Big Five 
factors. 

Agreeableness Likable, 
easy to get along with, 
friendly. 

Emotional stability 
Displaying little emotion; 
showing the same 
emotional response in 
various situations. 

Integrity Quality of being 
honest, reliable, and ethical. 


It does, however, appear that there are more than the originally proposed five factors. 
Roberts and colleagues (2005) argue that conscientiousness can be broken down further 
into three “subfactors” (industriousness, order, and self-control). Roberts and Mroczek 
(2008) suggest that extraversion can be further broken down into gregariousness and 
assertiveness. Lee, Ashton, and de Vries (2005) propose that a dimension of “honesty-humility” 
needs to be added to the FFM. Some evidence (Marcus, Lee, & Ashton, 2007) suggests 
that this new dimension might be useful in predicting counterproductive work behavior 
such as theft. We believe that there is still valuable work to be done in identifying “the” 
relevant parameters of the personality at work, and the FFM represents a useful point of 
departure. No one seems to disagree that the FFM contains the minimum number of 
relevant personality characteristics; the debate seems to be about the optimum number. 

Recently, Musek (2007) has suggested that there is really only one personality factor: a 
combination of the Big 5. Naturally, he calls his theory the Big 1 theory. He proposes that 
this single factor represents all the things one would value in a personality: conscientiousness, 
agreeableness, emotional stability, extraversion, and openness. It would be similar to the 
concept of “g” in cognitive ability. Perhaps it should be called “p.” Initial analyses make 
an interesting case for his Big 1 theory, but so far there are no data linking one’s Big 1 
factor score to behaviors such as satisfaction, motivation, absence, counterproductive 
behavior, or technical job performance. But we are certain that a substantial amount of research 
will be done in the next few years testing Musek’s proposition. A single factor would make 
life considerably simpler for personality assessment. If he is right, a large number of test 
vendors will be unhappy, since they depend on many and nuanced personality facets to 
brand their personality tests. 

What seems to be true is that, although each of the five broad personality factors does 
predict successful (in contrast to unsuccessful) performance of certain behaviors, some 
combinations of the factors may be stronger predictors than any single factor. This 
introduces the idea of a functional personality at work (Barrick, Mount, & Judge, 2001), 
meaning that not just one factor predicts success, but a combination of factors. For example, 
Ones and colleagues (1993) found that individuals who were high on conscientiousness, 
agreeableness, and emotional stability tended to have higher integrity. Integrity in this 
context means being honest, reliable, and ethical. Dunn (1993) found that managers with 
hiring responsibilities believed that a combination of conscientiousness, agreeableness, 
and emotional stability made applicants more attractive. Hogan and Hogan 
(1989) found that the same factors were related to employee reliability. In a review of 
meta-analyses, Barrick, Mount, and Judge (2001) confirm the importance of conscientiousness 
across a variety of occupations and performance measures. Emotional stability also 
appeared to predict overall performance across occupations. Judge and Erez (2007) found 
that a combination of high emotional stability and high extraversion (which they labeled 
a “happy” or “buoyant” personality) led to higher performance for employees and 
supervisors at a health and fitness center. In a recent meta-analysis, Clarke and Robertson (2005) 
find that low agreeableness, high extraversion, and low conscientiousness are related to 
accidents, both in occupational and traffic situations. They suggest that individuals low 
on agreeableness have more difficulty managing interpersonal relations, including following 
group safety norms. 

Other meta-analyses also reveal relationships between the FFM and job performance, 
both in the United States (Hurtz & Donovan, 2000) and with European data (Salgado, 
1997, 1998). The latter series of meta-analyses suggest that, at least for many European 
countries, culture may not be a moderator variable for the personality/performance 
relationship. More recent research suggests that personality is a critical predictor for work 
behavior in Germany (Moser & Calais, 2007), Australia (Carless et al., 2007), Thailand 
(Smithikrai, 2007) and the Netherlands (Klehe & Anderson, 2007a). Nevertheless, remember 
from Chapter 1 (e.g., Hofstede, 1980a, 2001) that cultural influences can be substantial 
and that substantial differences exist between Western and Asian cultures. As examples, 
Tyler and Newcombe (2006) show that additional personality characteristics such as 
“face,” “graciousness versus meanness,” and “thrift versus extravagance” might be 
necessary to describe the Chinese work personality. The importance of face (as in “to avoid 
losing face”) had been shown in earlier studies of Chinese students and managers by Cheung 
and colleagues (2001). It is tempting to recall the dramatic example from Chapter 1 of 
the Chinese manager who hanged himself, possibly because of a loss of face. Much more 
research on the nature of the non-Western work personality is in order. As suggested by 
the work of McCrae, Terracciano, and colleagues (2005), there is reason to expect that 
significant differences in work personality will be found in Asian societies as opposed to 
findings in Europe or the United States, if for no other reason than the emphasis on group 
outcomes over individual outcomes in the collectivist cultures of China and Japan. 


Implications of Broad Personality Models 

As the aspect of work behavior we are trying to predict gets broader (e.g., overall job 
performance), large factors (e.g., conscientiousness) seem to be as predictive of behavior as 
are smaller and more discrete factors. There is some debate about whether to use 
broad or narrow personality dimensions (Hogan & Roberts, 1996; Ones & Viswesvaran, 
1996; Schneider, Hough, & Dunnette, 1996). It turns out that narrow traits seem useful 
for predicting very specific job behaviors (Dudley, Orvis, Lebeicki, & Cortina, 2006) and 
broader traits for predicting broader behaviors (Tett, Steele, & Beauregard, 2003). Each 
has its own use. In fact, there appears to be a movement to keep broad personality 
models from suffocating research on other possible personality characteristics (Borman, 
2004a; Murphy & Dzieweczynski, 2005). Schmitt (2004) has suggested three promising 
personality characteristics: 


1. Core self-evaluation (Judge & Bono, 2001), a type of inner-directedness and sense 
of efficacy. 

2. Tolerance for contradiction (Chan, 2004), a way of thinking that prefers apparently 
contradictory information. 

3. Achievement motivation and aggression (Frost, Ko, & James, 2007; James, 1998), 
the tendency to aggressively seek out desired ends. 


We are not necessarily suggesting that these are “variables to watch”; we are simply 
echoing Schmitt’s sentiment that it is too early to close the door on variables beyond those in 
broad personality models such as the FFM. 

As we will see in Chapter 4, I-O psychology is becoming more specific in discussions 
of performance outcomes. Thirty years ago, most research and discussions would have 
addressed the issue of “overall performance.” Now discussion of performance includes 
specific aspects of work behavior such as citizenship behavior (e.g., volunteering, persisting), 
technical task performance, adaptive performance (adjusting to technical or procedural 
unpredictability in the work context), or counterproductive work performance. As it becomes 
more common to parse work performance into more discrete categories, narrower 
personality characteristics may begin to show their value over broad dimensions such as those 
presented by the FFM. 

Achievement A facet of 
conscientiousness 
consisting of hard work, 
persistence, and the desire 
to do good work. 

Dependability A facet of 
conscientiousness, 
consisting of being 
disciplined, well organized, 
respectful of laws and 
regulations, honest, 
trustworthy, and accepting 
of authority. 


There is a final aspect of the research on personality and work behavior that deserves 
discussion. Have you ever had a job in which you were closely supervised and required 
to follow very detailed work and organizational procedures? In that environment, you would 
have had little opportunity to show your “habitual way of responding” (i.e., your personality). 
Think of the opposite situation: a job where you had a good deal of control over your 
work habits. In the latter, you could really be “you,” and whether you performed well or 
poorly probably depended on how well your personality was suited to the job’s demands. 
That is exactly what Barrick and Mount (1993) found with their research on the FFM. In 
jobs where the employee had a great deal of control (i.e., autonomy), personality was much 
more predictive of performance than in jobs where the employee had little or no control. 
Thus, control moderated the relationship between personality and performance. In 
statistical terms, control would be called a “moderator variable,” a variable that changes 
the nature of the relationship between two other variables. It has been commonly found 
that if a situation does not allow the person much discretion (referred to as a “strong” 
situation), personality will play a minor role in his or her behavior. 
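
The notion of a moderator variable can be made concrete with a small simulation. The sketch below is only an illustration, not the analysis reported by Barrick and Mount (1993): the data are simulated, and the variable names, sample size, and effect sizes are all assumptions chosen for the example. It fits a regression with an interaction term (personality score multiplied by autonomy) and then compares the personality-performance slope in low-autonomy and high-autonomy jobs.

```python
import numpy as np


def fit_moderated_regression(x, m, y):
    """Least-squares fit of y = b0 + b1*x + b2*m + b3*(x*m).

    x: predictor (here, a hypothetical conscientiousness score)
    m: moderator (here, a hypothetical autonomy indicator, 0 or 1)
    Returns the coefficients [b0, b1, b2, b3].
    """
    design = np.column_stack([np.ones_like(x), x, m, x * m])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


def simple_slopes(coefs, m_low=0.0, m_high=1.0):
    """Slope of y on x at a low and a high value of the moderator."""
    _, b1, _, b3 = coefs
    return b1 + b3 * m_low, b1 + b3 * m_high


# Simulated (not real) data: all numbers below are illustrative assumptions.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                         # standardized personality score
m = rng.integers(0, 2, size=n).astype(float)   # 0 = low autonomy, 1 = high autonomy
noise = rng.normal(scale=1.0, size=n)
# Personality barely matters when autonomy is low (slope 0.05)
# and matters much more when autonomy is high (slope 0.05 + 0.40).
y = 0.05 * x + 0.40 * x * m + noise

coefs = fit_moderated_regression(x, m, y)
slope_low, slope_high = simple_slopes(coefs)
print("interaction coefficient (b3):", round(float(coefs[3]), 3))
print("slope in low-autonomy jobs:  ", round(float(slope_low), 3))
print("slope in high-autonomy jobs: ", round(float(slope_high), 3))
```

A nonzero interaction coefficient is what "moderation" means statistically: the slope relating personality to performance is not a single number but depends on the level of autonomy.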

To summarize what we know about the relationship between personality and work 
behavior, we believe the following conclusions can be drawn with confidence. 

1. Personality differences play an important role in work behavior independent of the 
role played by cognitive ability (Mount & Barrick, 1995; Murphy, 1996). 

2. Personality is more closely related to motivational aspects of work (e.g., effort 
expenditure) than to technical aspects of work (e.g., knowledge components). Personality 
is more likely to predict what a person will do and ability measures are more likely 
to predict what a person can do (Campbell, 1990a; Mount & Barrick, 1995). 

3. The FFM is a good general framework for thinking about important aspects of 
personality (Digman, 1990; Guion, 1998; Lubinski, 2000). 

4. The more relevant and specific the work behavior we are trying to predict, the stronger 
the association between personality and behavior (Mount & Barrick, 1995). 

5. Conscientiousness is best considered a combination of achievement and 
dependability. Achievement will predict some behaviors (e.g., effort) and dependability will 
predict other behaviors (e.g., attendance) (Hough, 1992; Moon, 2001; Mount & 
Barrick, 1995; Stewart, 1999). 

6. Conscientiousness (along with its constituent factors achievement and dependability) 
has widespread applicability in work settings. It is possibly the most important 
personality variable in the workplace and it may be the equivalent of “g” in the 
noncognitive domain (Schmidt & Hunter, 1992). 

7. Conscientiousness and its constituent factors (achievement and dependability) 
have a greater impact on behavior in situations where the worker has substantial 
autonomy (Barrick & Mount, 1993). 

8. Conscientiousness, achievement, and dependability are only a small collection of 
a number of interesting facets of personality. The single-minded pursuit of “g” 
slowed down advances in understanding intelligence for almost 80 years. We should 
not let the same thing happen with the single-minded focus on conscientiousness 
(Collins, 1998). 

9. There is evidence that factors other than conscientiousness have applicability for 
specific job families and occupations. Extraversion appears related to sales 
performance; openness to experience predicts training and expatriate success; 
agreeableness is associated with performance in customer-service and team-oriented jobs; 
emotional stability contributes to a broad range of jobs including management 
positions as well as jobs in the safety/security sector (Barrick, Mount, & Judge, 2001; 
Mount, Barrick, & Stewart, 1998; Vinchur, Schippmann, Switzer, & Roth, 1998). 

Personality Change over the Life Span 

One final caution needs to be made about personality as a construct. Unlike “g” or other 
abilities, personality does appear to change regularly over time. McCrae and colleagues 
(1999) found that there was a systematic decline in neuroticism, extraversion, and 
openness between college age and middle adulthood (30-49), whereas agreeableness and 
conscientiousness increased during the same time period. Roberts and Mroczek 
(2008) show that the most substantial changes (generally improvements) in 
personality occur between the ages of 20 and 40. Further, these positive changes 
continue to occur (though less dramatically) into older age periods (e.g., 60-70), 
particularly in terms of agreeableness and conscientiousness. Perhaps Wal-Mart 
and other retail giants are onto something in hiring retirees as greeters. For 
employers, this suggests that personality test scores should not be considered “concrete.” 
Instead of relying on personality test scores that reside in a personnel folder 
from 5 or 10 years earlier, retesting might be recommended when current 
employees are being considered for promotions or special assignment. 

ADDITIONAL ATTRIBUTES 


The collection of cognitive abilities, physical and motor abilities, personality, and 
interests covers the major categories of proposed individual differences. The 
patterns formed by their combinations describe much of the variation among 
individuals. Nevertheless, some scientists propose additional aspects of individual 
differences. 



Skills 


Skills Practiced acts, such 
as shooting a basketball or 
using a computer keyboard. 


Skills are practiced acts. Shooting a basketball, using a computer 
keyboard, and persuading someone to buy something 
are all examples of skills. They come with hours, days, and 
weeks of practice. Skills also depend on certain abilities 
(eye-hand coordination, or memory, or reasoning), personality 
characteristics (persistence or agreeableness), 
and knowledge (understanding the controls that activate a piece of equipment). 
Although the skills depend on these other factors, the reason we call them skills is 
that they develop through practice. Technical and job-related skills are as varied 
as jobs and job tasks. There are other nontechnical skills that are more widespread 
than any technical skill. Examples include negotiating skills, communication 
skills, and conflict resolution skills. These three are often lumped together by 
nonpsychologists and called people skills. Since they come into 
play most commonly in situations involving leader-follower 
and team member interactions, we will discuss them in the 
chapters that deal with teams and leadership. 

People skills A 
nontechnical term that 
includes negotiating skills, 
communication skills, and 
conflict resolution skills. 

Knowledge 


Knowledge can be defined as “a collection of discrete but related facts and 
information about a particular domain. It is acquired through formal education 
or training, or accumulated through specific experiences” (Peterson, Mumford, 
Borman, Jeanneret, & Fleishman, 1999, p. 71). Many cities known for tourism (e.g., 
London, New York City) require taxi drivers to complete map tests demonstrat- 



[Cartoon: NON SEQUITUR by Wiley] 
Sometimes, the skill in reading a blueprint can be very important. 
SOURCE: NON SEQUITUR © 2004 Wiley Miller. 
Distributed by Universal Press Syndicate. Reprinted 
with permission. All rights reserved. 
















BOX 3.4 THE SPREZZATURA OF TEACHING GUITAR MAKING 


Frank Finocchio teaches novices to make beautiful 
guitars in a little workshop in Northeast Pennsylvania. 
Finocchio’s involvement with musical instruments 
began when he was 8 years old; one day his 
father placed a full accordion on his lap and told 
him to learn to play it. He did, but adds that he is 
still learning. As a young man in the Coast Guard, 
Finocchio spent hours in the ship’s machine shop 
learning to use the machine tools. He then worked 
as a machinist at a compressor manufacturing company, 
and in 1989 was hired by the Martin Guitar 
Company to purchase exotic woods for guitar production. 
Soon he was teaching Martin customers 
how to make guitars. In 1998, he left Martin to 
begin teaching guitar making on his own. (You 
can read more at www.finocchioguitar.com/home.html.) 



Guitar-making teacher Frank Finocchio practices the art of sprezzatura: 
making the difficult appear easy to his students. 


Finocchio describes his teaching style with an 
Italian word: sprezzatura, the art of making the 
difficult appear easy (to his students). He actually 
stretches the meaning to include transforming the 
actual work of the luthier from the difficult to the 
easy. Sprezzatura happens through the careful integration 
of knowledge, skills, and ability, with a dash 
of personality thrown in. Frank developed a deep 
technical knowledge of both machine operations 
(lathes, drills, etc.) and wood properties (tonal 
qualities, reactions to woodworking tools) in his various 
occupations. He coupled that knowledge with 
skills peculiar to the luthier: the right angle and force 
level for chisels, the right amount of sanding at various 
stages, and so forth. To this, he added requisite 
abilities, including finger dexterity, eye-hand 
coordination, and visualization (being able to 
imagine, even in the early stages of construction, how 
a finished guitar will look). Visualization is particularly 
important in determining how to correct a 
student’s error (e.g., a wrong cut, a wrong hole) without 
starting all over. Finally, he adds the personality 
attributes of conscientiousness (particularly 
attention to detail), patience, and agreeableness. 
He guards against a strong achievement orientation, 
since he is dealing with students who do not have 
the potential for the perfection that he can achieve. 
In his low-key way, he helps them produce an 
instrument that is better than they could have ever 
imagined. 

In short, Frank Finocchio practices sprezzatura in 
his work and he accomplishes it with an admirable 
combination of knowledge, skills, abilities, and 
personality characteristics. 


ordinance that would require licensing of tour guides with particular attention to 
knowledge of the city’s history (Associated Press, 2007). Knowledge is closely connected to skill 
when we are considering job-related skills (as opposed to psychomotor skills like 
shooting a basketball). The knowledge of guitar-making principles interacts closely with the skills 
in tool use, as described in Box 3.4. Knowledge supports skill development and it comes 
in many varieties. It can be very basic (knowledge of mathematical operations or of 
vocabulary), or it can be sophisticated (knowledge of computer circuitry). Representative 
categories of knowledge as identified in the comprehensive occupational information network 



BOX 3.5 AN EXAMPLE OF TACIT KNOWLEDGE 


A postal worker gets on an elevator in a 25-story 
building and pushes the button for the 18th floor. 
Just before exiting the elevator at that floor, she 
pushes the button for the 25th floor, puzzling 
those left on the elevator who are going no higher 
than the 21st floor. The postal worker drops off mail 
and picks up mail from a central location on the 
18th floor in less than 60 seconds, returns to the 
elevator, pushes the down button, and re-enters 
the elevator she just left making its way down from 
the 25th floor. She has learned that if she does 
not follow this routine, the elevator may not go 
to the 25th floor and she may have to wait several 
minutes for another elevator to travel up the 18 floors 
to retrieve her. This is tacit knowledge at its 
finest. 


that has come to be known as O*NET are too detailed to present here, but they can be 
found in Peterson et al. (1999). The O*NET architecture presents the name of the 
knowledge domain, the definition of the knowledge, and examples of what someone with a great 
deal or very little of the knowledge might be capable of doing. Perhaps the most 
immediate example of individual differences in knowledge is the distribution of test grades in 
your class. Although many variables may play a role in this grade distribution, one of 
those variables is certainly knowledge of the course material as presented in the text and 
lectures. 

Another kind of knowledge that has been proposed is called tacit knowledge, studied 
by Sternberg and his colleagues (Sternberg & Wagner, 1986; Sternberg, Wagner, & 
Okagaki, 1993). They distinguish between “academic” and “tacit” knowledge, the latter 
described as “action oriented knowledge, acquired without direct help from others, that 
allows individuals to achieve goals they personally value” (Sternberg, Wagner, Williams, 
& Horvath, 1995). They describe tacit knowledge as “knowing how” rather than 
“knowing that.” Box 3.5 provides a practical example of tacit knowledge. A more formal way of 
distinguishing these two types of knowledge is procedural knowledge (knowing how) in 
contrast with declarative knowledge (knowing that). Interestingly, Rapp, Ahearne, Mathieu, 
and Schillewart (2006) find that pharmaceutical sales representatives work harder when 
they have high levels of declarative knowledge but lower levels of experience (which would 
eventually lead to procedural knowledge). 

These researchers give an example of how tacit knowledge about getting along with your 
boss might affect your behavior. If you need to deliver bad news, and you have reason to 
believe your boss is in a bad mood, tacit knowledge would tell you that it would be best 
to deliver the bad news later. A common nonscientific term for tacit knowledge might be 
“street smarts.” One of the important distinctions researchers make between formal or 
academic knowledge on the one hand and tacit knowledge on the other is that tacit 
knowledge is always goal-directed and useful, while academic knowledge may not be. People 
develop tacit knowledge about environments and processes that are personally valuable 
to them. Research suggests that tacit knowledge is something above and beyond 
intelligence (Sternberg et al., 1995). Learning little tricks to perform better might be considered 
the light side of the tacit knowledge coin, and learning how to manipulate people might 
be the dark side. Knowledge, particularly tacit knowledge, is often thought to accumulate 
as a result of experience. We address the possible role of experience in assessment on the 
text website. 


Occupational Information 
Network (O*NET) Collection 
of electronic databases 
that updated and replaced the 
Dictionary of Occupational 
Titles (DOT). 


Tacit knowledge Action-oriented, 
goal-directed 
knowledge, acquired 
without direct help from 
others; colloquially called 
"street smarts." 

Procedural knowledge 
Familiarity with a procedure 
or process; knowing "how." 

Declarative knowledge (DK) 
Understanding what is 
required to perform a task; 
knowing information about 
a job or job task. 







Competencies Sets of 
behaviors, usually learned 
by experience, that are 
instrumental in the 
accomplishment of desired 
organizational results or 
outcomes. 


Job analysis Process that 
determines the important 
tasks of a job and the 
human attributes necessary 
to successfully perform 
those tasks. 


Emotional intelligence (EI) 
A proposed kind of 
intelligence focused on 
people’s awareness of their 
own and others' emotions. 


Competencies 


I-O psychologists talk about combinations of knowledge, skills, abilities, and other 
personality characteristics (KSAOs) in terms of competencies. Kurz and Bartram (2002) have 
defined competencies as “sets of behaviors that are instrumental in the delivery of desired 
results or outcomes.” Following from that definition, it is reasonable to assume that 
people differ in the extent to which they possess various competencies. But competencies 
are different from knowledge, or a skill, ability, or personality characteristic, in that a 
competency is really a collection of all of these specific individual difference characteristics. 
The essence of a competency is the combination of these characteristics and is not 
dominated by any one (Harris, 1998a). We will review a model of competencies called “The Great 
Eight,” as proposed by Bartram (2005), in Chapter 4 when we discuss performance 
models. 

Competencies are unique in another way as well. Abilities can be defined and measured 
in the abstract, as can personality characteristics. But competencies only have meaning 
in the context of organizational goals. For example, you could distinguish between two 
individuals based on their measured conscientiousness, their reasoning ability, or their skill 
with a word processing program. But a competency of organizing and executing a 
business plan would require a combination of these three individual elements, in addition to 
various aspects of technical and procedural knowledge (Kurz & Bartram, 2002), and would 
have relevance only to that series of actions. Thus, competencies are really collections 
and patterns of the individual difference attributes we have already covered, rather than 
separate characteristics. We will return to competencies and how they are identified 
(competency modeling) in Chapter 4, as a new way of thinking about analyzing jobs, a 
process called job analysis. 

Emotional Intelligence 


In the 1980s Howard Gardner (1983, 1993) proposed a novel theory of intelligence. Rather 
than a unitary approach to intelligence such as “g,” he posited seven different types of 
intelligence, including logical-mathematical, bodily-kinesthetic, linguistic, musical, spatial, 
interpersonal, and intrapersonal. He described the latter two intelligences as follows: 

Interpersonal intelligence is the ability to understand other people: what motivates them, 
how they work, how to work cooperatively with them.... Intrapersonal intelligence, a 
seventh kind of intelligence, is a correlative ability turned inward. It is a capacity to form an 
accurate veridical model of oneself and to be able to use that model to operate effectively in 
life. (1983, p. 9) 

Gardner’s notion of inter- and intrapersonal intelligence was popularized by Goleman 
(1995a) using the label emotional intelligence (EI). EI is a relatively new concept with 
little in the way of an empirical data base at this point, but two questions about it have 
emerged. The first is whether this actually represents a kind of intelligence, a skill 
developed and honed with practice, or a personality characteristic (Barrett, 2001). Mayer, Roberts, 
and Barsade (2008) have recently defined EI as “the ability to carry out accurate 
reasoning about emotions and the ability to use emotions and emotional knowledge to enhance 
thought” (p. 507). The second question is how to measure EI; two different approaches 
have been proposed. One has been labeled the “mixed” approach and addresses EI as a 
personality characteristic. The other is called an ability approach and measures EI like any 
other measure of cognitive ability. 

In many respects, this becomes more a semantic battle than a theoretical one. 
Nevertheless, the studies that have been done on the construct have been disappointing, failing 
to identify EI as something different from attributes with which we are already familiar 
(Davies, Stankov, & Roberts, 1998; Roberts, Zeidner, & Matthews, 2001). Van Rooy, 
Viswesvaran, and Pluta (2005) report considerable overlap between the mixed model and 
personality measures, and similar overlap between the ability model and measures of 
cognitive ability. This type of overlap has led critics to question whether EI is really a new 
human attribute. A recent review of the two models (Mayer et al., 2008) finds little 
support for the mixed model as opposed to the ability model. In 2005, the Journal of 
Organizational Behavior devoted an entire section to the debate (Ashkanasy & Daus, 2005; 
Conte, 2005; Daus & Ashkanasy, 2005; Landy, 2005a; Locke, 2005; Spector, 2005). In 
addition, Murphy (2006) has published what is destined to be the “authoritative” volume on 
EI. There are vigorous advocates on both sides. 

Lyusin (2006) provides a more refined view of EI. He suggests that the value of EI in
predicting behavior is related to the extent to which the individual is in touch with the
internal world of emotional reality. Thus, it is possible for an individual to be emotionally
intelligent in the sense that he or she can correctly classify emotions, but to have little
or no interest in acting on that classification. Nonpsychologists might simply label this
type of person as unemotional or “cold.” Lyusin proposes that EI is a combination of
cognitive ability (including speed of information processing), ideas about emotions (i.e., the
importance attributed to them as potentially useful information), and emotional stability.
A recent review of the results of EI studies in field settings (Mayer et al., 2008)
concludes that there are small but significant relationships between EI as defined by the ability
approach and outcomes such as team behavior and satisfaction with negotiation outcomes.
Unfortunately, many of these studies do not adequately control for either personality
characteristics, “g,” or both, so we are still left to wonder about the unique contribution of EI
to work behavior. To be fair, we are only beginning to understand team and work group
behavior. It may very well be that when we have a better understanding of team and work
group dynamics, EI will begin to play a larger role in predicting valuable work outcomes,
but we are far from that point right now.


Construct Psychological
concept or characteristic
that a predictor is intended
to measure; examples are
intelligence, personality,
and leadership.


MODULE 3.2 SUMMARY


• Fleishman and his associates developed a taxonomy
of 52 abilities, divided into the broad
categories of cognitive, physical, and perceptual-motor
abilities.


• “Intelligence” (or “g”) is a very general mental 
capability that describes a person’s ability to 
learn from experience. 


• Meta-analyses of the relationship between “g” and
job performance demonstrated that the more 
complex the job, the stronger the predictive 
value of general intelligence tests. 


• Carroll proposed that intelligence had three layers,
or strata. The highest layer is “g”; the next
layer down consists of seven more specific abilities:
fluid intelligence, crystallized intelligence,
memory, visual perception, auditory perception,
information retrieval, and cognitive speed.

• Physically demanding jobs require strength, 
flexibility, and stamina or aerobic endurance. 
Hogan proposed a seven-measure taxonomy of 
physical abilities, and combined these seven 
measures to form three higher-order physical 












Chapter 3 Individual Differences and Assessment 


abilities: muscular strength, cardiovascular 
endurance, and movement quality. 

• It is important to determine whether employers’
physical ability tests are fair to female applicants
and older applicants, since both of these
groups tend to have less strength than young men
do. One way of enhancing the performance of
females and older applicants on these tests is to
encourage applicants to train ahead of time.
It is also important that these tests relate to job
performance prediction rather than injury
prediction.

• There are clear connections between aspects of
personality and various work behaviors, both
productive (e.g., job performance) and
counterproductive (e.g., dishonesty, absenteeism). I-O
psychologists studying personality use a taxonomy
labeled the Big 5 or the Five Factor Model
(FFM).

• Of these five factors, the one that has attracted
the most attention from I-O psychologists is
conscientiousness. Barrick and Mount concluded,
on the basis of a meta-analysis, that conscientiousness
was positively related to success in all
aspects of work for all occupations.

• Barrick and Mount found through FFM research
that in jobs where the employee had a great deal
of control or autonomy, personality was much
more predictive of performance than in jobs
where the employee had little or no control.

• Skills are practiced acts. Although skills depend
on ability, personality, and knowledge factors,
what makes us call them skills is that they
develop through practice.

• Knowledge can be defined as “a collection of discrete
but related facts and information about a
particular domain. It is acquired through formal
education or training, or accumulated through
specific experiences.” Another proposed kind of
knowledge is tacit knowledge, described as
“knowing how” rather than “knowing that.” A
more formal way of distinguishing these two
types of knowledge is procedural knowledge
(knowing how) compared with declarative
knowledge (knowing that).

• Competencies are “sets of behaviors that are
instrumental in the delivery of desired results or
outcomes.” Competencies are different from
knowledge, or a skill, ability, or personality
characteristic, in that they are really a collection
of all of these specific individual difference
characteristics.

• Those who invoke the concept of emotional
intelligence suggest that there is a unique kind
of intelligence that is focused on our awareness
of our own and others’ emotions.


KEY TERMS 


taxonomy
perceptual-motor abilities
affect
IQ
Intelligence Quotient
meta-analysis
Flynn effect
mean
standard deviation
muscular tension
muscular power
muscular endurance
stamina
sensory abilities
Americans with Disabilities Act
psychomotor abilities
Big 5
Five Factor Model (FFM)
conscientiousness
functional personality at work
agreeableness
emotional stability
integrity
achievement
dependability
skills
people skills
Occupational Information Network (O*NET)
tacit knowledge
procedural knowledge
declarative knowledge
competencies
job analysis
emotional intelligence (EI)
construct



The Past and the Present of Testing 


Yvonne felt as if she had been preparing for this day forever. There had been similar days,
sure: the SAT exam to get into college, and the civil service test she took to get her
summer job in the State Personnel Department. But this was show time. A high score
would be the ticket she needed for getting into a good graduate program. That was
exactly the problem. Yvonne choked up on standardized tests, always had and probably
always would. Even though her SAT score had been low, she would [...]
overall GPA and a 3.5 in her major. But getting into graduate school was not
as easy as it had been to qualify for her undergraduate program. The thing that really annoyed
her was that these tests measured such a narrow band of who she was and what her
capabilities were that it was a joke. How would they know that Yvonne was [...]
friendly, and had learned to read music in a weekend? Did they even care that she took
hard courses rather than “cruisers”? She understood that there had to be some standard
way of selecting among applicants; she just wished that it was not a [...].

Society seems to have a love-hate relationship with psychological testing, [...]
as old as psychology itself. The term “mental test” was introduced by Cattell in 1890. As
we described in Chapter 1, in the First World War over a million soldiers were tested for
intelligence in order to determine which were best suited to be officers and which
infantry. Up to that point, intelligence testing had been done on an individual basis, and
this first trial of group testing was considered a massive success for the [...].
But with this success came an embarrassment; soon after the war, psychological testing
began to be used as the justification for limiting immigration. The army testing program
discovered that immigrants and their offspring, who did not speak English as a first
language, scored lower on these intelligence tests. Fearing that unchecked immigration would
reduce the national intelligence level, Congress enacted immigration quotas. Although social
critics were quick to point out the potential unfairness of intelligence testing, advocates
saw it as a way to avoid the class system that had characterized industry and education in
the 19th century. In their view, a test was “objective” and thus freed decisions (about jobs
or education) from the grasp of favoritism and nepotism.

Private industry, like the government, was impressed by the success of the army testing
programs and moved to implement testing as a way of selecting the most promising
candidates from a pool of job applicants. Soon, however, the Great Depression of the 1930s
arrived, drastically reducing the need to select from an applicant pool. There were no jobs
to be had. When America entered the Second World War, the country returned to a full
employment mode and virtually every able-bodied and motivated worker, male or female,
either had a job or was serving in a branch of the armed services. Ships and airplanes were
being built in record numbers, requiring one of the first “24/7” industrial environments.
Now there was no need for selection for the opposite reason: There were many more jobs
than people.

On the military front, commanders quickly realized that war was now much more
technologically advanced than it had been a generation earlier. Personnel needed to operate
many different types of aircraft and ships with complex maintenance and repair demands.
The task of the armed forces was no longer simply distinguishing between officers and
infantry. The war effort needed pilots, bombardiers, artillery personnel, radar and sonar
operators, and an enormous training and administrative staff. Psychological testing was
once again pushed to the forefront as a tool in the war effort, this time with more
sophisticated tests for the placement of recruits.

By the end of the Second World War, test developers had virtually glutted the market,
offering ability, personality, interest, and knowledge tests. Neither the government nor the
psychological profession exercised much control over the quality of the tests or the
meaning of the test scores. A thriving and competitive testing industry operated without
constraint until the early 1960s, when two societal forces converged to rein in testing. The
first was a new wave of criticism about the value of testing from social observers (Gross,
1962; Whyte, 1956). These critics pointed out that employers were expecting job
applicants to submit to a range of tests that had little apparent relationship to the job for which
they were applying. Many of the tests, particularly the interest and personality tests, asked
questions of a personal nature: topics like religion, sex, and politics. The second force
was the passage of the Civil Rights Act of 1964, which prohibited discrimination in
employment, including testing. If a test had the effect of reducing the employment
opportunities of protected subgroups (e.g., African Americans, women), then the employer would
need to provide evidence of the validity of that test. Since many of the tests available at
that time had little validity evidence, employers saw this as a difficult hurdle to overcome.

As a result of the questions about the invasion of privacy and the possible discriminatory
effects of tests, there was a marked reduction in test use for selection purposes,
particularly intelligence and personality tests. The reticence lasted well into the 1970s, by which
time more evidence of validity for tests had become available and the courts had clarified
what was acceptable evidence for validity. At this time, research began to emerge
showing that tests of cognitive ability were just as valid for minority test takers as for majority
test takers. By the mid-1980s, testing was back in full swing and both intelligence and
personality testing began to appear with greater frequency.

As we will see in the modules that follow, the content and process of employment
testing is varied and encouraging. I-O psychologists have identified many different attributes
that appear to contribute to work performance. Furthermore, I-O psychologists have identified 
many different methods for assessing these attributes. 

But concerns about the “fairness” of testing continue to arise in many different settings.
To mention just a few, some universities have decided to abandon standardized testing
for applicants and introduce nonstandardized techniques that will permit motivation,
interests, and values to play a greater role in student admissions. In both teacher and student
testing in K-12 environments, there is a vigorous debate, and occasional lawsuits (e.g.,
Gulino et al. v. Board of Education of the New York City School District of the City of New
York and the New York State Education Department, 2002), about the value of standardized
tests for teacher certification and the awarding of high school diplomas. For example, many
school districts require the successful completion of a series of content-specific tests (e.g.,
in mathematics or biology) as well as more general tests (e.g., knowledge of liberal arts)

3.3 Foundations of Assessment 




before granting teachers a permanent teaching certificate. In response to scandals such as
the Enron and WorldCom accounting fraud cases, MBA programs began considering the
use of new “tests” of ethics, morality, and integrity to determine whom to admit to their
MBA programs (Jackson, 2002).

Underlying all of these debates, the issue of fairness remains: Are standardized tests both
effective and fair instruments for selecting among individuals? For every standardized test,
there will be critics suggesting that the standardization prevents an illumination of the
“essence” of the person. For every nonstandardized suggestion, there will be critics who will
argue that the lack of standardization permits favoritism. Psychological testing will always
have a values component to it in addition to the issues related to content and process.

What Is a Test?

Robert Guion (1998) defined a test as “an objective and standardized procedure for
measuring a psychological construct using a sample of behavior” (p. 485). Seventy years
earlier, Clark Hull (1928) had proposed a virtually identical definition. Few definitions in
psychology have remained so constant for such a long time. One of the appealing
characteristics of this definition is that it is broad enough to cover a wide variety of tests and
testing procedures. It encompasses paper and pencil tests, Internet testing, interviews, actual
attempts to perform a piece of work (a work sample test), and even an application blank.
The definition is also broad enough to cover many different types of content, including
cognitive ability, personality, values, communication skills, interpersonal skills, and
technical knowledge. In the modules that follow, we will review various content categories, as
well as various techniques for assessing that content. As an example, if we were interested
in the technical knowledge of an applicant for a word processing position, we could give
the applicant a paper and pencil test and an interview, check with previous employers,
have the applicant complete an actual word processing task at a workstation, or examine
the applicant’s formal education credits. Each of these techniques could be used to assess
the same attribute: technical knowledge. Similarly, we might be interested in a number of
different attributes of the applicant beyond technical knowledge, including communication
skills, personality characteristics, interests, integrity, and career plans. We might use one
or more interviews to assess each of these additional attributes. As you can see from
Figure 3.4, in most practical testing situations, we are looking at the combination of
attributes to be assessed (content) and ways to assess those attributes (process). Most
employers look at several attributes using several techniques. Earlier in this chapter, we
introduced the initialism KSAO (knowledge, skill, ability, other characteristics) to
summarize the attributes of a worker. In one way or another, every test is an assessment of
one or more of these content areas.

Test An objective and
standardized procedure for
measuring a psychological
construct using a sample of
behavior.

FIGURE 3.4 Two Attributes Measured Using Two Different Procedures
(Attributes: reasoning, social skills. Methods of assessment: paper and pencil test, interview.)


What Is the Meaning of a Test Score? 

As Guion (1998) suggested, the term “objective” in his definition of a test implies
quantification: some kind of score on the test. It may be a simple pass-fail score (e.g.,
you may pass or fail a driver’s license examination) or a score on some graded
continuum (such as an 88 percent or a B+). But the simple process of assigning a score is quite
different from interpreting the meaning of that score. For example, if your instructor curves
exam scores, and the exam was a tough one, an 88 might be in the A range. If, on the
other hand, the test was an easy one and virtually everyone got a 94 or above (except you),
your 88 might be in the B range or lower.

Meaning is usually assigned to test scores through a process known as norming. 
Norming simply means comparing a score to other relevant test scores. In many employment
settings, we compare individuals to one another, so the rules we use for making these
comparisons should be unambiguous and fair. Test scores are often interpreted relative 
to some set of norms. In the classroom example above, your score of 88 percent is given 
meaning, or interpreted, by comparing it to the grades of your fellow students (the norm 
group). Instead of being compared to others in your class who took the same test you 
did, the instructor could have compared your score (and the scores of your classmates) 
to those of earlier classes who took midterms in the same content area. Or the instructor 
may not have curved the test at all but held to some previously determined comparison 
scale (90 to 100 percent = A; 80 to 89 percent = B, etc.). The development of test norms 
is very technical; excellent discussions of the process are presented in Guion (1998) and 
Cohen and Swerdlik (2002). For our purposes, it is simply important to be aware that 
while a test produces a “score,” there is a need to interpret or give meaning to that score. 
As you will recall from our earlier discussion of validity in Chapter 2, validity is about 
inference: What can we infer from a test score about future performance? The meaning 
of a test score is a question of validity (Messick, 1995). 
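
The classroom example above can be made concrete with a small sketch. The two norm groups below are invented for illustration, and the function names (`percentile_rank`, `z_score`, `fixed_scale_grade`) are hypothetical helpers rather than anything defined in the text. The sketch shows how the same raw score of 88 takes on a different meaning depending on the norm group, and how a fixed comparison scale ignores the norm group altogether.

```python
from statistics import mean, pstdev


def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores at or below the given score."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


def z_score(score, norm_scores):
    """Distance of a score from the norm-group mean, in standard deviations."""
    return (score - mean(norm_scores)) / pstdev(norm_scores)


def fixed_scale_grade(score):
    """Previously determined scale from the text: 90-100 = A, 80-89 = B."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    return "below B"


# Hypothetical norm groups (assumed data): a tough exam and an easy exam.
tough_exam = [62, 70, 74, 78, 81, 83, 85, 88, 90, 92]
easy_exam = [88, 94, 95, 95, 96, 97, 97, 98, 99, 100]

print(percentile_rank(88, tough_exam))  # 80.0
print(percentile_rank(88, easy_exam))   # 10.0
print(fixed_scale_grade(88))            # B
```

Against the tough exam, an 88 sits near the top of the group; against the easy exam, it sits at the bottom. The fixed scale returns the same grade in both cases, which is why the choice of comparison standard is part of interpreting the score.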

Test Users and Test Interpretation 

The issue of validity and meaning of a test score brings us to the more practical issue of
who will interpret the test. Suppose you had been in an auto accident and suffered a head
injury, and your doctor referred you to a neurologist for testing to look for any resulting
impairment. Suppose further that the results of that testing arrived in the mail filled with
numbers, diagnostic categories, and technical descriptions of the results of that testing. After
poring over the results for an hour, you still might not know whether your injury caused
lasting impairment. You would need formal training in neurology to understand the
meaning of the numbers and narrative from a standard neurological test battery.



Source: © The New Yorker Collection 1987 Dana Fradon from cartoonbank.com. All rights reserved.


Norming Comparing a test 
score to other relevant test 
scores. 


Norm group Group whose
test scores are used to
compare and understand an
individual’s test score.




TABLE 3.2 Twelve Minimum Competencies for Proper Use of Tests

Avoiding errors in scoring and recording.

Refraining from labeling people with personally derogatory terms like dishonest on the basis of a test
score that lacks perfect validity.

Keeping scoring keys and test materials secure.

Seeing that every examinee follows directions so that test scores are accurate.

Using settings for testing that allow for optimum performance by test takers (e.g., adequate room).

Refraining from coaching or training individuals or groups on test items, which results in
misrepresentation of the person’s abilities and competencies.

Willingness to give interpretation and guidance to test takers in counseling situations.

Not making photocopies of copyrighted materials.

Refraining from using homemade answer sheets that do not align properly with scoring keys.

Establishing rapport with examinees to obtain accurate scores.

Refraining from answering questions from test takers in greater detail than the test manual permits.

Not assuming that a norm for one job applies to a different job (and not assuming that norms for one
group automatically apply to other groups).



Similarly, an individual needs to have formal training in psychological assessment to
know how to interpret the results of many psychological tests. Furthermore, individuals
who lack suitable training are prone to making erroneous interpretations and, consequently,
inappropriate decisions and actions. Fortunately, several documents are available that spell
out proper and ethical procedures for test score interpretation and use (e.g., American
Educational Research Association et al., 1999; Moreland, Eyde, Robertson, Primoff, & Most,
1995; Society for Industrial and Organizational Psychology, 2003; Turner, DeMers, Fox,
& Reed, 2001). Table 3.2 presents a list of the competencies expected of those responsible
for administering and interpreting psychological tests. As you can see, psychological
testing, if done ethically and effectively, is no simple process. The advent of the Internet
has led to an explosion of cheap and available “tests.” Many of these tests have been
developed and are being administered by individuals who fall far short of demonstrating the
competencies recommended by Table 3.2. This is where the I-O psychologist can help the
industrial client navigate through the Internet testing swamp. The principles for developing
and administering Internet tests are basically the same as for more traditional tests;
there are just many more of these tests, they are poorly documented, and they often have
little research to support them.







What Is a Test Battery? 

A test battery is a collection of tests, usually of different attributes. These attributes may
be within a single area, such as a cognitive battery including subtests of reasoning,
memory, and comprehension; or the attributes may be from conceptually different areas, such
as a battery that includes a measure of cognitive ability, a personality test, a physical
ability test, and a test of vocational interests. The term “battery” usually implies that all of the
tests will be taken either in a single testing period or over a very short period of time. But
whether the information being considered is from several different assessment devices
administered at one time or over a lengthy period of time, the critical issue is how to combine
that information. Will it be combined to yield a single score with weights assigned to
individual tests using a statistical equation of some type, or will the evaluator combine the
individual test scores using a logical or nonstatistical process to yield a final
recommendation? We considered the issue of statistical combination in Chapter 2 in the section on
regression, but we will consider the broader issue of how test information may be
combined at greater length in Chapter 6 when we deal with staffing decisions.

Test battery Collection of
tests that usually assess a [...]
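
As a rough illustration of the statistical option described above, the sketch below standardizes each test in a small battery and then forms a weighted sum for each applicant. The test names, raw scores, and weights are all invented for the example; in practice the weights would come from something like the regression analysis discussed in Chapter 2, not from the arbitrary values assumed here.

```python
from statistics import mean, pstdev


def standardize(raw_scores):
    """Convert one test's raw scores to z-scores within the applicant pool."""
    m, sd = mean(raw_scores), pstdev(raw_scores)
    return [(x - m) / sd for x in raw_scores]


def composite_scores(battery, weights):
    """Weighted sum of standardized test scores, one composite per applicant."""
    z = {test: standardize(scores) for test, scores in battery.items()}
    n_applicants = len(next(iter(battery.values())))
    return [
        sum(weights[test] * z[test][i] for test in battery)
        for i in range(n_applicants)
    ]


# Hypothetical battery: three applicants (in order A, B, C), three tests.
battery = {
    "cognitive": [30, 25, 20],
    "personality": [40, 50, 60],
    "physical": [10, 10, 13],
}
# Assumed weights, chosen only for illustration.
weights = {"cognitive": 0.5, "personality": 0.3, "physical": 0.2}

composites = composite_scores(battery, weights)
for name, value in zip("ABC", composites):
    print(name, round(value, 3))
```

Standardizing first keeps a test with a wide raw-score range from dominating the composite simply because of its scale. A logical or nonstatistical combination, by contrast, would leave the weighing of the three scores to the evaluator's judgment.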

Where to Find Tests 

At various points in the text, we mention some specific tests by name. There are literally
thousands of psychological tests available on a broad range of topics. Textbooks on
testing provide lists and examples of tests. For example, Anastasi and Urbina (1997) presented
an extensive list of tests covering a range of topics, as well as a listing of test publishers.
A more complete listing of tests, as well as reviews of those tests, can be found in two
established sources. The first is known as the Mental Measurements Yearbook (MMY).
This was first published in 1938 and has been updated an additional 16 times. The 17th
edition (Geisinger, Spies, Carlson, & Plake) was published in 2007. Buros Institute
(named after the founder of the MMY, Oscar K. Buros) also publishes a companion
volume without reviews called Tests in Print. The most recent edition of this companion
text was released in 2006 (Tests in Print VII; Murphy, Spies, & Plake, 2006). Table 3.3
presents a typical entry in the MMY.

Thomas (2004) has published an excellent handbook about tests that are available for
use in business settings. It covers both the content and format of all categories of
modern industrial assessment and provides in-depth reviews of both devices and assessment
practices generally. It is a valuable resource for those with an interest in industrial
assessment and a bit more focused than the other sources listed above because they deal with all
tests (e.g., counseling, educational) rather than tests used exclusively in a business setting.

Mental Measurements
Yearbook Widely used
source that includes an
extensive listing of tests as
well as reviews of those
tests.

TABLE 3.3 Short Employment Tests, Second Edition

Scores: Verbal, Numerical, Clerical, Total.

Forms: 2 forms of each subtest (Verbal, Numerical, Clerical Aptitude).

Price data: $249 per starter kit, including 25 each of the Verbal, Numerical, and Clerical Aptitude test booklets, scoring [...]
manual; $95 per package of 25 test booklets (specify version and Form 1 or 2); [...] per package of 100 test
[...] (specify version and Form 1 or 2); $30 per combination scoring key; $45 per manual.

Distribution of Form 1 restricted to banks that are members of the American Banking Association.

Authors: George K. Bennett and Marjorie Gelink.

Publisher: Harcourt Assessment, Inc.

Administrative Test Categories


In descriptions of tests and testing, you may encounter several terms that require a brief
explanation.

Speed versus Power Tests

Some tests have rigid and demanding time limits such that most test takers will be unable
to finish the test in the allotted time. These are called speed tests. As Murphy and
Davidshofer (2005) pointed out, if someone scores poorly on a speed test, it is not clear
whether the person actually knew the answers but could not respond quickly enough, or
would have been unable to answer correctly no matter how much time was allotted. Power
tests have no rigid time limits. While some test takers may still not finish, enough time is given
for a majority of the test takers to complete all of the test items. The items on power tests tend
to be answered correctly by a smaller percentage of test takers than those on speed tests.

Assessment professionals find that speed tests provide greater variability among
candidates, allowing for more effective prediction, but they carry some vulnerabilities. The most
obvious is whether the job actually requires such speed for successful performance. Few
jobs have such demands. The second potential pitfall is the possibility of introducing
unfairness to the testing process [...] effects of the aging process [...] a decline in
information-processing speed. As we age, we take longer to complete cognitive operations.
In many instances, this slowing process is irrelevant to the actual demands of a job;
[...] matter [...] a worker [...] seconds rather than 3 seconds to accomplish
a task. As we saw in Box 3.3, in terms of
simple reaction time, the difference between an “old” person and a “young” person is as
little as 1/20th of a second! Nevertheless, there are some professions (e.g., airline pilot,
police officer, firefighter, bus driver) where speed of information processing or time to
completion of an action might be critical. Disabled individuals, particularly those with
learning disabilities, may also find themselves at a disadvantage on a speed test. One of the
most common requests for a testing accommodation made by individuals under the
Americans with Disabilities Act (1990) is for additional time to complete a test. Thus, speed
tests may increase the risk of legal challenge from many groups unless it can be shown
that the type of speed required by the test is also required by the job.

Speed test Has rigid and
demanding time limits so
most test takers will be
unable to finish the test in
the allotted time.

Power test Has no rigid
time limits; enough time is
given for a majority of the
test takers to complete all
of the test items.

The television show Jeopardy is an example of a speed test.

Group versus Individual Tests 

Most standardized written tests, even if administered to single individuals, could be 
administered in group format. A cognitive ability test could be given to 20,000 police academy 
candidates in a convention center, or individually in a room on an army base where an 
officer candidate is stationed. Group tests are efficient because they allow for the testing 
of many candidates simultaneously, resulting in rapid screening compared to individually 
administered tests. Group testing is also often valuable in reducing the costs (in both time 
and money) of testing many applicants. As we will see shortly, Internet testing involves a 
virtual group rather than a physical group. 

Certain tests, however, can be given only on an individual basis. Examples include an
interview, a test of hand-eye coordination, or an elaborate assessment of candidates for a
high-level executive position based on interviews, work samples, and individually
administered personality tests. Individual tests are also often more appropriate when the
employer wishes to assess a candidate’s style of problem solving rather than the simple
products of the problem-solving process. Individual testing formats are also appropriate
when the examiner needs to establish an interpersonal rapport with the test taker.

Paper and Pencil versus Performance Tests 

Paper and pencil tests are one of the most common forms of industrial testing. By
extension, the modern version of the paper and pencil test might be the computer keyboard
test where the keys and mouse are used only to choose the correct response or produce
a narrative response to a question. Given the increasing popularity of computer- and
Internet-administered tests, it might be better to adopt a term other than “paper and
pencil testing”; a distinction such as nonmanipulative versus manipulative might be more apt.
We will discuss computer and Internet testing later in this chapter.

Performance tests require the individual to make a response by manipulating a particular
physical object or piece of equipment. The score that the individual receives on the
test is directly related to the quality or quantity of that manipulation. An example might 
be a test administered to a candidate for a dental hygienist position. The candidate might 
be asked to prepare a tray for cleaning or scaling teeth, to prepare a syringe of novocaine 
for administration by the dentist, or to prepare a mold for taking an impression of a row 
of teeth. In this case, the candidate’s skill in performing these tasks may be as important 
as his or her knowledge of how to carry out the actions. 

Testing and Culture 


In the 1950s and 1960s, testing was largely lacking in controls, either legal or professional. 
As social critics pointed out, the quality of tests was therefore variable, and the potential 
for cultural influence and bias was substantial. An example would be a test that used a 


Paper and pencil test One
of the most common forms
of industrial testing that
requires no manipulation of
any objects other than the
instrument used to respond.


Performance test Requires
the individual to make a
response by manipulating a
particular physical object or
piece of equipment.


Group test Can be
administered to large
groups of individuals; often
valuable in reducing the
costs (both in time and
money) of testing many
applicants.


Individual test Test given
only on an individual basis.






high level of vocabulary to assess a relatively simple and straightforward skill. Instead
of asking “How much is two plus two?” the item might have read, “If one were asked to
calculate the arithmetic sum of the two integers that have been reproduced below, what
would the resultant number be?” The second item would surely be more difficult for someone
with a limited vocabulary or low reading comprehension to answer, even though both
items are ostensibly assessing the same basic skill: arithmetic proficiency. Modern tests
have eliminated most if not all of these reading level problems. What they may not have
done, however, is to eliminate cultural influences.

Murphy and Davidshofer (2005) distinguished among three terms in discussing tests
and testing: bias, fairness, and culture. They rightly pointed out that bias is a technical
and statistical term that deals exclusively with the situation in which a given test results in
errors of prediction for a subgroup. Thus, if a test underpredicts the job performance of
women (i.e., predicts that they will score lower on some criterion than they actually do)
and overpredicts the job performance of men (i.e., predicts that they will score higher on
some criterion than they actually do), then the test would be said to be biased. You will
remember that earlier in this chapter, we described a case involving a strength test for female
applicants in a meat packing plant. In essence, the judge in that case ruled that the strength
test was biased since it predicted that a substantial percentage of women would perform
poorly and almost all men would perform well at meat packing tasks. In fact, the test might
have predicted injuries but was not effective in predicting actual performance on the job.
In contrast, fairness is a value judgment about actions or decisions based on test scores.
Many employers base hiring decisions on tests of general mental ability. Many applicants
believe that in addition to (or instead of) the cognitive ability test, dependability and
motivation should play a role. This was the view of Yvonne in the example at the beginning
of this chapter. In the view of many applicants, the test and the method of hiring are unfair
even though there may be no statistical bias in predictions of success.
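
The distinction between underprediction and overprediction can be made concrete with a small numerical sketch. The Python example below uses invented criterion scores and a hypothetical helper, `mean_prediction_error`, neither of which comes from Murphy and Davidshofer (2005); it simply averages predicted minus actual scores within each subgroup. Formal bias analyses typically compare regression lines across subgroups, so this should be read as a simplified illustration rather than a standard procedure.

```python
from statistics import mean


def mean_prediction_error(predicted, actual):
    """Average of (predicted - actual); negative values mean underprediction."""
    return mean(p - a for p, a in zip(predicted, actual))


# Invented criterion scores (for example, supervisor ratings) for two subgroups.
women_predicted = [3.0, 3.5, 2.5, 3.0]
women_actual = [3.5, 4.0, 3.0, 3.5]
men_predicted = [4.0, 4.5, 3.5, 4.0]
men_actual = [3.5, 4.0, 3.0, 3.5]

women_error = mean_prediction_error(women_predicted, women_actual)
men_error = mean_prediction_error(men_predicted, men_actual)

print(f"women: {women_error:+.2f}")
print(f"men: {men_error:+.2f}")
```

A negative mean error indicates underprediction for that subgroup and a positive one indicates overprediction; errors near zero for every subgroup would be consistent with an unbiased test in the sense described above.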

Murphy and Davidshofer (2005) considered fairness to be a philosophical or political
term and not a scientific one. They gave an example to make their point. A test of physical
strength might predict job success equally for male and female firefighter applicants, yet
eliminate most of the female applicants because they have less upper body strength than males.
Many individuals would consider such a test unfair even though it was unbiased, because
it prevents women from becoming firefighters. In contrast, a biased test might be used to
increase the number of minorities in a particular job or company, but still be considered
fair because it corrects for a past under-representation of those minority group members.
Culture is a third concept, separate in many respects from either fairness or bias. Culture
addresses the extent to which the test taker has had an opportunity to become familiar
with the subject matter or processes required by a test item (Murphy & Davidshofer, 2005).
In many tests for teacher certification, there is a component that addresses the general
cultural literacy of the candidate: for example, how well he or she knows works of art
and music, variations of modern dance, and the deeper meaning of literary passages (National
Evaluation Systems, 2002). Consider the following hypothetical test items:


A crackberry is:

a) a late summer wild fruit that thrives in Northern Sweden

b) a hybrid of a raspberry and blackberry

c) a personal digital assistant

d) a person addicted to crack cocaine

Phishing is:

a) fishing while drunk

b) to criminally acquire sensitive information

c) a method of advancing electrical wire through a conduit

d) a lip noise intended to make another be quiet


Bias Technical and 
statistical term that deals 
exclusively with a situation 
where a given test results 
in errors of prediction for a 
subgroup. 

Fairness Value judgment
about actions or decisions
based on test scores.

Culture A system in which
individuals share meanings 
and common ways of 
viewing events and objects. 







The answers to these questions (c and b respectively) are generationally “biased.” Your
grandmother, or maybe even your father, may not know the answers to these questions,
but chances are good that your roommate will. Would items like these be considered
generationally “fair”? Probably not.

Greenfield (1997) presented examples of difficulties in “transporting” North American 
cognitive ability tests to other cultures. In one example from Cole, Gay, Glick, and Sharp
(1971), Liberians were asked to engage in a cognitive sorting task. They were asked to sort 
20 objects into categories. According to the test developers, the objects divided evenly into 
four categories: food, food containers, clothing, and implements. Rather than sorting the 
20 objects into these four categories, however, the Liberian participants made functional 
pairings of the objects. For example, they paired a potato with a knife, reasoning that a 
knife is used to cut a potato. When they were asked why they did that, they would reply 
that this is how a “wise man” would complete the task. After repeated attempts to get them 
to use the four neat categories of items, and repeatedly getting the “wise man” response, 
the researchers asked the participants to sort the items as a “fool” would do it. The subjects
promptly sorted the items into the four categories that the researchers preferred. As
Greenfield noted, “the researchers’ criterion for intelligent behavior was the participants’ 
criterion for foolish, the participants’ criterion for wise behavior was the researchers’ 
criterion for stupid” (p. 116). Greenfield concluded that to use a test developed in one 
culture for another culture, there must be agreement on the value of particular responses 
to particular questions, as well as agreement that the items mean the same thing in the 
different culture. Note that neither of these requirements has anything to do with the quality
of the test’s linguistic translation; instead, they relate to meaning in a deeper sense.
Sternberg (2004) has argued vigorously that intelligence cannot be understood without
taking into account the culture in which it is measured. He cites the example of the Taoist
culture, in which intelligence includes the importance of humility, freedom from conventional
standards of judgment, and full knowledge of oneself. In contrast, the Confucian perspective
emphasizes the importance and persistence of life-long learning with enthusiasm.

A study of East Indian and American workers underscored Greenfield’s caution 
(Ghorpade, Hattrup, & Lackritz, 1999). Although the researchers found few differences 
between Indian and American men or women with respect to the measurement of the 
personality variable locus of control, there were substantial differences between Indian and 
American women in the meaning of a self-esteem measure. Indian women were much 
more likely to feel guilt over individual activities, such as seeking opportunities to succeed
and achieve, that might be seen by Americans as evidence of self-esteem. This was
likely the result of differences between Indian and American women on Hofstede’s
collectivism-individualism dimension. In addition, it is likely that American women were more
likely to identify with the masculine end of Hofstede’s masculinity-femininity dimension,
favoring accomplishment rather than interpersonal relations. Thus, if Indian and
American women were compared on self-esteem, a researcher might see the Indian
women as having “less” esteem when, indeed, what Americans view (and admire) as
self-esteem had a far less positive connotation to the Indian women.

As Americans from different ethnic groups increasingly mingle in public schools, universities,
other public institutions, and work settings, they are becoming more familiar with
each other’s subculture today than was the case 30 years ago. As a result, the concept of
the cultural content in current tests is becoming less of an issue in explaining differences 
among ethnic groups. At the same time, cultural content is becoming an increasingly 
important issue in the workplace because of the growing multicultural nature of work and 
the increasing cultural diversity of applicant populations. 



International Assessment Practices

Earlier in the chapter, we reported research that found that tests of mental ability
were used more commonly in Europe than in the U.S. This is just one example of the
differences that can be found worldwide in assessment practices. Variations in global
assessment practice will become increasingly important in the next decade for both
multinational employers and applicants to multinational organizations. Several reviews of
assessment in other countries help to illustrate the differences between assessment in the
U.S. and assessment elsewhere (Bartram & Coyne, 1998; Muniz et al., 2001; Oakland, 2004;
Roe & van den Berg, 2003). Highlights from these reviews include the following:

European psychologists would like to have a more structured role for professional
institutions in developing and monitoring good testing practices. In response to that
expressed need, the International Test Commission developed the International
Guidelines for Test Use (International Test Commission, 2000).

In industrial settings, the countries in which tests were most frequently administered
by psychologists were Croatia, Bulgaria, Finland, Slovakia, Denmark, Japan,
and Slovenia. The countries in which tests were most frequently administered by
nonpsychologists included Canada, Sweden, Cyprus, Norway, Switzerland, the
U.K., and Germany.

The greatest amount of information about test quality could be found in the
U.S., the Netherlands, Japan, the U.K., Canada, Spain, Finland, Belgium, and
Slovakia; the least amount of information was available in China, Denmark,
Ukraine, and South Africa.

In India and China, testing is largely unregulated; many countries are moving toward
the certification and training of nonpsychologists who use tests (Bartram, 2005).


In general, it would appear that the various guidelines available for evaluating tests, test 
use, and test users in the U.S. (American Educational Research Association et al.
Standards, 1999; SIOP Principles, 2003; Turner et al., 2001; Uniform Guidelines, 1978)
are ideals to which many other countries aspire. 












MODULE 3.3 SUMMARY 


Employment testing was first widely used after
the First World War and has been heavily
influenced by the Civil Rights Act of 1964. I-O
psychologists are interested in determining how
effective various tests are in predicting work
performance. They have identified many different
attributes that appear to contribute to work
performance and many different methods for
assessing these attributes.

The definition of a test encompasses paper and
pencil tests, interviews, actual attempts to perform
a piece of work (a work sample test), and
even an application blank. The definition is also
broad enough to cover many different types of
content, including cognitive ability, personality,
values, communication skills, interpersonal
skills, and technical knowledge.


In Module 3.2 we introduced the initialism
KSAO (knowledge, skill, ability, other characteristics)
to summarize the attributes of a
worker. In one way or another, every test is an
assessment of one or more of these content
areas.

Tests can be described or differentiated according
to categories that include speed versus
power tests, individual versus group tests, and
paper and pencil versus performance tests.

In discussing tests and testing, it is important to
consider three factors: bias, or errors of prediction;
fairness, a value judgment about decisions
based on test scores; and culture, the extent to
which a test taker has the opportunity to
become familiar with the subject matter.


KEY TERMS 


mental test
test
norming
norm group
test battery
Mental Measurements Yearbook
speed test
power test
group test
individual test
paper and pencil test
performance test
bias
fairness
culture


Module 3.4


Assessment Procedures


Assessment Content versus Process


Employers and applicants often confuse the content of testing with the process of testing.
As we suggested earlier in this chapter, there is a difference between what attribute is being
assessed and how it is being assessed. For example, after applying for a job with a local
company, an applicant might describe the process as including a personality test, a cognitive
test, an interview, and a background check. The terms “personality” and “cognitive”
describe the content of the assessment and the terms “interview” and “background
check” describe the process of the assessment. The reason why this content-process
distinction is important is that you will often see claims for the “validity” of the interview
or work sample. But the validity depends not so much on the process by which the
information was gathered as on the content of that information. In the sections that follow,
we will consider information gathered in various formats, ranging from a paper and pencil
test to an interview. But as we discussed earlier, many of these methods can be used
to gather many different kinds of information. For example, an interview could assess
communication skills, knowledge, ability, or personality, or, as is most often the case, a
combination of those “content” categories. First, we will consider the content of assessment,
and then the process for gathering this content.


Assessment Procedures: Content


Cognitive Ability Tests 

Guion (1998) defined cognitive ability tests as those which: 

allow a person to show what he or she knows, perceives, remembers, understands, or can
work with mentally. They include problem identification, problem-solving tasks, perceptual
(not sensory) skills, the development or evaluation of ideas, and remembering what one has
learned through general experience or specific training. (p. 486)

Although Guion identified what seem to be a variety of cognitive abilities (e.g.,
remembering, problem identification), as we saw earlier in this chapter, there is still a
vigorous debate regarding whether there is only one overarching cognitive ability, “g,” or


Cognitive ability Capacity to 
reason, plan, and solve 
problems: mental ability. 














Wonderlic WPT-R Sample Questions


1. Which of the following is the earliest date? 

A) Jan. 16,1898 B) Feb. 21,1889 C) Feb. 2,1898 D) Jan. 7,1898 E) Jan. 30,1889 

2. LOW is to HIGH as EASY is to ? .

J) SUCCESSFUL K) PURE L) TALL M) INTERESTING N) DIFFICULT 

3. A featured product from an Internet retailer generated 27, 99, 80, 115, and 213 orders over a 5-hour period.
Which graph below best represents this trend? 



A B C D E 


4. What is the next number in the series? 29 41 53 65 77 ? 

J) 75 K) 88 L) 89 M) 98 N) 99 

5. One word below appears in color. What is the OPPOSITE of that word? 

She gave a complex answer to the question and we all agreed with her. 

A) long B) better C) simple D) wrong E) kind 

6. Jose's monthly parking fee for April was $150; for May it was $10 more than April; and for June $40 more 
than May. His average monthly parking fee was ? for these 3 months. 

J) $66 K) $160 L) $166 M) $170 N) $200 

7. If the first two statements are true, is the final statement true? 

Sandra is responsible for ordering all office supplies. 

Notebooks are office supplies. 

Sandra is responsible for ordering notebooks. 

A) yes B) no C) uncertain 



J K L M N 


9. Which THREE of the following words have similar meanings? 

A) observable B) manifest C) hypothetical D) indefinite E) theoretical 

10. Last year, 12 out of 600 employees at a service organization were rewarded for their excellence in customer 
service, which was ? of the employees. 

J) 1% K) 2% L) 3% M) 4% N) 6% 

Answers: 1. E, 2. N, 3. D, 4. L, 5. C, 6. M, 7. A, 8. KLM, 9. CDE, 10. K

FIGURE 3.5 Wonderlic Personnel Test—Revised Sample Items 

Source: © 2007 Wonderlic, Inc. • 1795 N. Butterfield Road, Suite 200, Libertyville, IL 60048 • 800.323.3742 • www.wonderlic.com.
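
The numerical items in Figure 3.5 (items 4, 6, and 10) can be verified with simple arithmetic. The short Python sketch below is an illustration added here, with variable names chosen for readability; it is not part of the Wonderlic materials.

```python
# Item 4: the series 29, 41, 53, 65, 77 grows by a constant step.
series = [29, 41, 53, 65, 77]
step = series[1] - series[0]
# Confirm that every consecutive pair differs by the same step.
assert all(b - a == step for a, b in zip(series, series[1:]))
next_number = series[-1] + step

# Item 6: average monthly parking fee over April, May, and June.
april = 150
may = april + 10
june = may + 40
average_fee = (april + may + june) / 3

# Item 10: 12 rewarded employees out of 600, expressed as a percentage.
percent_rewarded = 12 * 100 / 600

print(next_number, average_fee, percent_rewarded)
```

The three results correspond to options L, M, and K in the answer key printed with the figure.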





In more than a century of cognitive ability testing, there have been
tests that produce a single number intended to represent cognitive
ability, tests of specific abilities, and test batteries that purport
to measure several different facets of cognitive ability. Figure
3.5 presents sample items from one of these tests, the Wonderlic
Personnel Test—Revised.

Tests that Produce a Single Score 

An example of a test intended to produce a single score representing
general mental ability is the Wonderlic Personnel Test. It includes
50 items that assess verbal, numerical, and spatial abilities. Because
the administration time is 12 minutes and most applicants cannot
finish the test in the allotted time, the Wonderlic is considered a
speed test. There are elaborate norms for the Wonderlic, making
its interpretation relatively simple. Its ease of administration and
scoring make it a popular device for many organizations. Murphy
and Davidshofer (2005) endorsed the use of the Wonderlic,
pointing to its high reliability and strong correlations with other,
more elaborate, tests of intelligence.


Which would be the better shears for cutting metal?



FIGURE 3.6 Sample Item from Bennett Mechanical
Comprehension Test

Source: Bennett Mechanical Comprehension Test Form BB, Item Y.
Copyright © 1942, 1967-1970, 1980, 1997 by NCS Pearson, Inc.
Reproduced with permission. All rights reserved. “Bennett Mechanical
Comprehension Test” is a trademark, in the US and/or other countries,
of Pearson Education, Inc., or its affiliates.


Tests of Specific Abilities

As implied by Guion’s definition, many tests concentrate on only one aspect of
cognitive ability. One such test is the Bennett Test of Mechanical Comprehension.
The sample item from this test (see Figure 3.6) asks the test taker to examine
the two different cutting instruments and to deduce, from either experience or logic,
that the shears labeled B would be more effective at cutting metal than A. One can
imagine that such a test item might be well suited for choosing applicants for the
trade position of sheet metal worker or plumber.

Another example of a specific mental ability is spatial relations. Consider the
item in Figure 3.7. It requires the test taker to do some actual mental manipulation
of the factory shown from the front by “turning” the factory in his or her
mind and then choosing the response that would most closely resemble how the
factory would look from the back. This ability to manipulate objects in one’s mind
is particularly useful for many hardware repair or “troubleshooting” professions,
such as an auto mechanic or computer repair technician, where it is necessary to



Above is a picture of a factory shown from the front.
From the back, it would look like:



C D 


FIGURE 3.7 Spatial Relations Item from a Test for Firefighters 












Cognitive test battery 
Collection of tests that 
assess a variety of 
cognitive aptitudes or 
abilities; often called 
Multiple Aptitude Test 
Batteries. 


Knowledge test Assesses 
the extent to which 
individuals understand 
course or training materials;
also administered for
licensing and certification 
purposes. 


visualize a component buried deep under the hood of a car or in a hard drive. There are
many other examples of specific cognitive abilities, such as clerical and perceptual accuracy,
memory, and reasoning. Most testing texts (Anastasi & Urbina, 1997; Cohen & Swerdlik,
2002; Guion, 1998; Murphy & Davidshofer, 2005; Thomas, 2004) provide detailed
descriptions of these tests. Mumford, Baughman, Supinski, and Anderson (1998) presented
a sophisticated treatment of how to measure complex cognitive abilities such as reasoning
and creative problem solving.

Cognitive Test Batteries 

Multiple aptitude test batteries have a long history in psychological testing in industry.
Thurstone (1938) introduced a test of Primary Mental Abilities (PMA) that assessed numerical
ability, verbal ability, reasoning, spatial relations, perceptual speed, and memory. More
recent examples of multiple aptitude test batteries include the Armed Services Vocational
Aptitude Battery or ASVAB (Katz, 1987; Murphy & Davidshofer, 2005), the Differential
Aptitude Test Battery or DAT (produced by the Psychological Corporation, 1973, 1974),
and the General Aptitude Test Battery or GATB (Hartigan & Wigdor, 1989). The ASVAB,
as implied by its name, is used exclusively by the armed services. The GATB is used
exclusively by the federal government. The DAT is commercially available for employer use.
Students are more likely to be familiar with the Scholastic Aptitude Test (SAT) or
Graduate Record Examination (GRE), both examples of cognitive test batteries. In one
way or another, these batteries all measure verbal, numerical, spatial, and reasoning abilities.
Although cognitive test batteries take longer to administer than a “single score” test
like the Wonderlic, or any test of an individual facet of cognitive ability, they do have the
advantage of providing more detailed information about particular manifestations of cognitive
ability that may be more important in one job than another.

Knowledge Tests 

Tests you will take in this and other courses are knowledge tests. They assess the extent 
to which you know course material. These types of tests are typically tailored to course or 
training material. Knowledge tests are also administered for licensing and certification purposes,
including teacher certification, nuclear power plant operator licensing, and licenses
to practice law or medicine, or to sell investments. Knowledge tests are like any other type
of test and require the same care in development, norming, and administration. We will
discuss non-paper and pencil forms of knowledge tests later in this chapter.

Tests of Physical Abilities 


As we saw earlier in the chapter, there are seven basic physical ability attributes (Hogan,
1991a). These include static strength, explosive strength, coordination, and stamina or aerobic
endurance. While it is possible to measure each of these physical abilities in isolation,
most physically demanding jobs actually require combinations of these abilities. As
a result, many physical ability testing procedures tend to use simulated pieces of work to
assess the combined abilities. For example, consider a test frequently used to assess the
physical abilities of firefighter candidates (see Table 3.4), composed of several events
each of which requires multiple abilities. An excellent review of physical abilities and their
measurement appears in a study of age and physical abilities conducted for the Equal
Employment Opportunity Commission (EEOC) and the Department of Labor (Landy, Bland,
Buskirk, et al., 1992). There is substantial evidence that measures of physical abilities can
improve the prediction of job success for many physically demanding jobs (Arnold,




TABLE 3.4 Physical Ability Tests for Firefighters


[...]: Candidate wears fire protective clothing and air tank and carries seven pieces of equipment up three
[...], one piece at a time. Each piece of equipment weighs between 25 and 55 pounds.

[...]: Candidate wears air tank, stands in one [spot], and pulls 50 feet of fire hose filled with water using a
hand-over-hand technique.

[...]: Candidate wears air tank and pulls a 16-foot ladder from the ladder bed of a fire truck, places it on the
[...] it back up, and replaces it in the ladder bed.

[...]: Candidate drags a 125-pound sandbag around a serpentine course of 40 feet. The candidate must keep
[...] contact with the ground and may not lift or carry the sandbag but must drag it.

[...]: Candidate wears fire protective clothing and an air tank. After putting on a blackened face mask, the
candidate must crawl through a plywood maze that has several turns in it. In addition, there are sandbags located
[...] throughout the maze. The maze is approximately 40 feet in length.

[...]: Candidate wears an air tank and alternately pulls and pushes a 75-pound weight attached to a pole hanging
[...]. The candidate must complete as many repetitions as possible in a 4-minute period. A repetition is
[...] one push and two pulls.

[...]: Candidate wears fire protective clothing and an air tank and lifts a 50-pound fan from ground level, hanging
[...] in a standard door frame.


Rauschenberger, Soubel, & Guion, 1982; Campion, 1983; Hoffmann, 1999; Reilly, Zedeck,



Psychomotor abilities 
Physical functions of 
movement associated with 
coordination, dexterity, and 
reaction time; also called
motor or sensorimotor 
abilities. 


& Tenopyr, 1979). Arvey, Landon, Nutting, and Maxwell (1992) provide a good description
of the development and validation of an entry-level physical ability examination for
police officers. The caution, as we have seen earlier, is that the physical ability tests are
most defensible when used to predict performance rather than risk of injury.


Tests of psychomotor abilities involve the coordinated movement of the limbs in
response to situational factors. It may be a complex task in which the individual is
required to move arms and legs in coordination, as in flying an airplane, driving a
vehicle, or playing an organ; or it may be a simple or discrete action such as firing a
weapon, pulling a lever, or administering an injection to a patient. For some jobs,
psychomotor abilities represent characteristics of the individual that have some
potential for contributing to successful job performance above and beyond cognitive
abilities, physical abilities, or personality characteristics. Psychomotor abilities are
usually assessed using a task that requires dexterity, such as placing pins in slots with
tweezers, as depicted in Figure 3.8. Ackerman and his colleagues have developed
some sophisticated computer-based psychomotor tests for the selection of applicants for
jobs such as air traffic controllers (Ackerman & Cianciolo, 1999, 2002).

FIGURE 3.8 The Crawford Small Parts Dexterity Test


Personality 


As we have seen earlier in the chapter, personality attributes are now widely recognized
as contributors to job success. There are many commercially available instruments for
measuring personality characteristics, many based on the Big 5 model. Table 3.5 lists some
of the more commonly used personality instruments. The history of personality testing can be
described in two general phases. The early foundation of personality testing was focused on
the identification of the abnormal personality and evidence of possible psychopathology
(i.e., mental illness). Using personality testing for that purpose might be thought of as an
attempt to screen out potentially problematic employees. With the advent of instruments
intended to provide quantitative descriptions of the normal (rather than abnormal)
personality, personality testing in employment shifted to a screen in process whereby
employers sought to identify applicants with positive personality characteristics (e.g.,
conscientiousness, emotional stability, or agreeableness) that would contribute to effective
performance.

TABLE 3.5 Some Commonly Used Personality Instruments

[...]
California Psychological Inventory
Personality Research Form (PRF)
Edwards Personal Preference Schedule
Jackson Personality Inventory-Revised (JPI-R)
16 PF
NEO-PI
Hogan Personality Inventory
Saville Consulting Wave

As you can see, Table 3.5 is separated into two sections. The upper section includes tests
that have been frequently used for purposes of identifying signs of psychopathology (screen
out tests). The tests listed in the lower section have been more frequently used to identify
variations of normal personality (screen in tests). There is an important distinction
between these two different categories of tests. Tests developed or intended to identify
psychopathology, or used commonly for that purpose, are considered “Medical Tests” under
the Americans with Disabilities Act (1990), particularly if the test is administered by a
clinical or counseling psychologist or a psychiatrist. As such, they may not be administered
until after an offer of employment has been made, as is the case with physical examinations,
because emotional disorders are considered covered disabilities under the ADA.
Applicants might be placed at a disadvantage in the selection process if their condition
were revealed through pre-employment testing. On the other hand, tests developed or
intended to assess normal personality may be administered as pre-employment tests and
used for purposes of choosing among applicants prior to an offer of employment. If an
employer administers a test such as the MMPI-II in order to choose among applicants
prior to an offer of employment, that practice can be challenged in court and the
applicant will likely win that challenge.

There are many positions of public trust (e.g., public safety officers, nuclear power plant
operators, air traffic controllers, commercial airline pilots) that warrant testing for possible
psychopathology to guard against catastrophic pathological actions by the incumbent. But
most job titles in industry do not directly involve the health and welfare of the public,
and testing for personality abnormalities would be questionable in such jobs. Figure 3.9
presents some sample items from the Saville Consulting Wave test, which is frequently
used to assess normal personality in job applicants.

Practice Issues Associated with Personality Measures

Up to this point, we have been dealing with the “science” of personality. But there are
also practical questions that arise about the measurement of personality for making


Screen out test Used to 
eUminate candidates win 
are clearly unsuitable for 
employment; tests of 
psychopathology are 
examples of screen out 
tests in the employment 
setting. 

Screen intest Usedto 
add informabon about the 
positive attributes of a 
candidate that might predict 
outstanding performance: 
tests of nonnal personality 
are examptes of screen in 
tests in the emptoyment 
setting. 


3.4 Assessment Procedures 


139 


baiaigeofrespoi«esvatyingfromverystteng^ agree to very ^tongfydisa^. Choose the response attemabve that best descrdies how you feel 
ImtiIs are about being good at sotnelhing aid ottiets are about wlat you prefer, need, or ate interested in. Itead ^ staternert careful^ because ttieie inay be 
bMweai what you are good at and what you may need. Try to answer the quesbons from a work persperdne as mudi as possible. 


1 

Very 

Strongly 

Oi»gree 

9rong^ 

1 Disagree 

Disagree 

SUghtly 

Disagree 

Unsure 

Syghtty 

Agree 

Agree 

Strongly 

Agree 

Very 

Strong 

Agree 

have a cter set of priorities 










^Mtai encouraging otfiers 











IflBMIE 3.9 What the Saville ConsuHing Wave Looks Like 
^liKtSavileConsutting. 

[ ^^yment decisions. Hogan, Hogan, and Roberts (1996) addressed those larger practical 
E^^ons, as summarized in Box 3.6. 

There is one final and controversial point about personality tests that is not addressed directly
in Box 3.6. Some tests, particularly some commercially available integrity tests, are very
transparent. It is obvious how one should answer the test questions in order to appear to
have high integrity. This is a bit different from a cognitive ability test where a candidate
might fake “dumb” answers (which would probably mean that the person is dumb) but
cannot pretend to be “smarter” than he or she actually is. A candidate might bear the
following “script” in mind when answering integrity test questions:

I have never stolen anything since I was a young child, and even then, I don’t think I ever
stole anything. I do not have any friends who steal, or would even think of stealing anything.
If they did, they could not be my friends anymore and I would tell the appropriate authorities
that they had stolen something. I think that showing up for work late, not doing a complete
job, leaving work early, and taking sick days when you are not sick is also stealing and
I would not do any of those things or be friends with anyone who would. I would inform
management if I ever found out that a co-worker was engaging in any of these behaviors.

This “script” is only partly facetious. It is amusing in its extremity, but it makes the point
that it is possible to answer questions on a personality-like device in a way that gets the
best result, that is, an offer of employment. But what about tests that are not so
transparent? From a practical standpoint, there are actually three questions to answer: (1) How
difficult is it to fake personality tests? (2) How many people do it? and (3) How much does
it matter whether people do or do not fake? Let’s take these one at a time.

How easy is it to fake personality tests? Not difficult. As Hogan and colleagues (1996)
pointed out, some are easier to fake than others. But you can answer any personality test
in a way that makes you look “good.” The real question is whether doing that truly qualifies
as “faking.” From some perspectives, personality is all about self-presentation; it is your
public face, your “game face.” So to the extent that the personality test is a paper and pencil
form of self-presentation, it is not faking, nor is it distortion (Hogan et al., 1996; Mount
& Barrick, 1995). Interestingly, and surprisingly, De Fruyt, Aluja, Garcia, Rolland, and Jung
(2006) found little relationship between intelligence and the tendency to fake.

BOX 3.6

Q: There are many personality tests and scales available. How do you choose among them?
A: Use valid and reliable tests that cover at least the Five Factor Model dimensions.

Q: Why should you use a test that measures more than one aspect of personality when you are
interested in only one?
A: Because behavior usually is a function of many different influences, not just one.

Q: What do personality tests measure?
A: A person’s typical “style.”

Q: Why use personality tests to make employment decisions?
A: Because most workers and managers use terms like “being a team player,” “remaining calm
under pressure,” “being persistent,” and “taking initiative” as critical for success in almost
any job.

Q: Do personality tests predict job performance?
A: Yes.

Q: Do personality tests predict performance in all jobs?
A: Probably, but they are less predictive for jobs with little autonomy.

Q: Weren’t personality tests developed to measure psychopathology and for use in clinical
settings?
A: Many years ago, that was true. The tests available today are designed to assess normal
personality.

Q: People’s behavior changes constantly. Doesn’t this invalidate personality tests?
A: By definition, personality is relatively stable over time and from one set of circumstances
to another and continues to affect our lives in important ways. Even though behavior changes
occasionally, stable aspects of personality are still effective predictors.

Q: Do personality measures discriminate against ethnic minorities, women, older individuals,
and the disabled?
A: There is no evidence of discrimination against these groups in well-developed personality
tests. People over 40 tend to receive more positive scores than those under 40. There are some
differences between males and females (men have higher scores on emotional stability and
women have higher scores on conscientiousness), but these are not significant enough to result
in different hiring decisions.

Q: Do personality tests invade privacy?
A: Some appear to. Choose tests with the highest validity and reliability, and the fewest number
of offensive-appearing questions.

Q: What is the best way to use personality measures for pre-employment screening?
A: In combination with measures of technical skills, experience, and the ability to learn.

Source: Based on Hogan et al. (1996).

Some researchers (e.g., Young, White, & Heggestad, 2001) suggest that the way to
neutralize faking is to develop forced-choice tests that require an individual to rank or
rate him- or herself by considering a number of alternative positive-appearing items.
The logic is that by requiring ordering or preference, an individual cannot appear to be
superior on all of the statements. Although this technique appears to reduce unintentional
distortion of responses to personality test items, it appears to have little effect on
an individual who intentionally fakes a response (Heggestad, Morrison, Reeve, & McCloy,
2006). Ellingson, Sackett, and Connelly (2007) argue that there may be no need for
any elaborate techniques, such as forced-choice items, to reduce unintentional distortion.
They compared results on a personality test given to the same employee, once for selection
purposes within an organization (e.g., promotion) and at a different point in time
(sometimes before and sometimes after) for purposes of employee development (for
training, not selection). The differences were minimal, suggesting little distortion. The
authors hasten to point out, however, that all of the participants had jobs and may
have felt less motivated to distort than a person applying for a job with a new employer
would.
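
The forced-choice logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a scoring key from any published test: the four statements, the function name forced_choice_scores, and the rank-to-points rule are all hypothetical. The sketch shows that ranking fixes the total number of points, so a respondent cannot earn the top score on every statement. It says nothing about whether the format defeats intentional faking.

```python
from itertools import permutations


def forced_choice_scores(ranking):
    """Convert a ranking of statements (most descriptive first) into points.

    Hypothetical rule: with n statements, the first-ranked statement earns
    n points, the next n - 1, and so on down to 1 point.
    """
    n = len(ranking)
    return {statement: n - position for position, statement in enumerate(ranking)}


# Hypothetical positive-appearing statements, one per trait.
statements = ["organized", "calm under pressure", "outgoing", "curious"]

# Whatever order the respondent chooses, the points always add up to the
# same total, so raising one score necessarily lowers another.
totals = {
    sum(forced_choice_scores(order).values()) for order in permutations(statements)
}

# By contrast, a (hypothetical) 5-point rating scale lets a respondent
# claim the maximum on every statement at once.
likert_max = {statement: 5 for statement in statements}

print("Distinct forced-choice totals:", totals)
print("Rating-scale total when every item is maxed:", sum(likert_max.values()))
```

Because every ordering produces the same total, forced-choice scores describe a respondent's relative standing across the statements rather than an absolute level on each one.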

Some have suggested that the real issue is whether the test taker has the correct
frame of reference (FOR) for taking the test (Lievens, De Corte, & Schollaert, 2008). As
an example, consider being asked to take a personality test and told to use one of three
perspectives: at school, at work, or in general (Schmit, Ryan, Stierwalt, & Powell, 1995).
Chances are that your personality at work differs from your personality in nonwork social
settings. As a sales representative, for example, you could be outgoing but in nonwork
settings might be more reserved because there are fewer demands for extraversion. In a
study with customer service reps for an airline, Hunthausen, Truxillo, Bauer, and Hammer
(2003) found that specifically instructing employees to adopt an “at work” FOR increased
the validity of personality test scores. If these findings are replicated in other studies, it may
have an effect on both research and practice. In the research context, it may mean that
many of the reported validities of personality tests for predicting performance may
substantially underestimate those values because a FOR other than “at work” may have been
adopted by the test takers. In practice, this finding suggests that when personality tests are
administered in a selection context, the respondent should explicitly be told to adopt an
“at work” FOR.

How many people fake personality measures? It is hard to know (Mount & Barrick,
1995) because the prevalence depends, as we have seen in the preceding paragraph, on
how you define faking. Some studies say the rate of faking is substantial, whereas others
say it is minimal. The main evidence to suggest that faking may be occurring is that
applicant groups often have significantly more positive scores on given personality measures
than employed groups (Bass, 1957; Kirchner, Dunnette, & Mousely, 1960; Weekley,
Ployhart, & Harold, 2004), and, not surprisingly, the tendency seems to be greater
among American than non-American applicants (Sandal & Endresen, 2002). In addition,
sophisticated statistical analyses of responses to personality questionnaires (Michaelis &
Eysenck, 1971; Schmit & Ryan, 1993) show that there are different patterns of responses
from applicants than from employees or students. Birkeland, Manson, Kisamore, Brannick,
and Smith (2006) find that the pattern of faking corresponds to what an applicant might
guess are the most important characteristics of the job in question. Generally, applicants
received substantially higher scores on emotional stability and conscientiousness than
nonapplicants; the positive differences for extraversion and openness to experience were
smaller.

This brings us to a third question: How much does it matter? The answer is that it
does not appear to matter much. In studies where participants were instructed to
distort their responses to make themselves look good, the predictive validity of the
personality measures remained the same (Hough, Eaton, Dunnette, Kamp, & McCloy,
1990). And if we return to the self-presentation view of personality, “distortion” could
either increase or decrease the validity of the personality measures. If the job in question
were a sales position, some have suggested that a desire to look “good” in the eyes of
another might actually be a job-related attribute (Hogan et al., 1996). A meta-analysis
(Viswesvaran, Ones, & Hough, 2001) seems to effectively rebut that hypothesis, at least
for managers. There was essentially a zero correlation between a test taker’s desire to look
“good” and his or her supervisory ratings on interpersonal skills. On the other hand, if
an individual is having a performance counseling discussion with a supervisor, a more
realistic presentation of strengths and weaknesses by the individual would be more
effective than trying to look good. The issue of faking is not “settled” yet (Mueller-Hanson,
Heggestad, & Thornton, 2003), but there does seem to be some agreement that it is
not a fatal flaw in personality testing (Hough, 1998; Hough & Ones, 2001; Ones &
Viswesvaran, 1998; Salgado, Viswesvaran, & Ones, 2001; Viswesvaran & Ones, 1999;
Weekley et al., 2004).

There is one additional cautionary note of some practical significance for test takers inclined
to intentionally distort their responses. Most personality tests have a “lie” scale, which
indicates if a person is trying to make himself or herself look “ideal” in some way. The
test report for an individual will usually include a cautionary note indicating a lack of
confidence in the resulting scores if the applicant scored too high on the lie scale. In
addition, there is some research (Dwight & Donovan, 2003) that indicates that if an
individual test taker is warned that (1) faking can be identified, and (2) faking will have
negative consequences in terms of being selected for a position, the test taker will be less
likely to fake.

Integrity Testing

Overt integrity test Asks questions directly about past honesty behavior (stealing, etc.)
as well as attitudes toward various behaviors such as employee theft.

Until recently, integrity testing meant honesty testing. Employers have always been
concerned with dishonest employees. We will consider counterproductive employee behavior
in depth in Chapters 4 and 10, but for now, note that employee theft can make the
difference between profitability and failure for an organization. Employers are often
vigorous in investigating incidents of employee dishonesty after the fact. Money or
product is disappearing: who is taking it? But honesty and integrity tests were developed
to predict who might act dishonestly in the future rather than who is actually responsible
for a counterproductive act.

Although honesty and integrity tests have been around for more than 50 years (Ash,
1976), there has been more enthusiasm for them in the past 15 to 20 years for several
reasons. The first reason is economic: More and more employers are concerned about the
high cost of dishonest employees, and integrity tests are relatively inexpensive. In
addition, from the I-O perspective various meta-analyses have demonstrated the predictive power
of such tests. Finally, the polygraph legislation passed in 1988 radically reduced the use of
the polygraph for pre-employment honesty screening, making paper and pencil tests more
attractive, particularly those shown to be valid for predicting important work behaviors
such as theft and absence. In jobs where polygraphs are permitted, integrity tests are
considerably cheaper than extensive background checks or polygraph tests.

There are two different types of integrity tests: overt and personality based. The overt
integrity test asks questions directly about past honesty behavior (stealing, etc.) as well
as attitudes toward various behaviors such as employee theft. The personality-based
integrity test measures honesty and integrity with less direct questions dealing with
broader constructs such as conscientiousness, reliability, and social responsibility and
awareness. Examples of both types of items are presented in Table 3.6.

Personality-based integrity test Test that infers honesty and integrity from questions
dealing with broad constructs such as conscientiousness, reliability, and social
responsibility and awareness.

TABLE 3.6 Examples of Overt and Covert Integrity Test Items

Overt Items
There is nothing wrong with telling a lie if no one suffers any harm (True or False?)
How often have you arrived at work under the influence of alcohol?
Do your friends ever steal from their employers?

Covert or Personality-based Items
Do you like taking risks?
Would your friends describe you as impulsive?
Would you consider challenging an authority figure?

Source: Spector (2000).

There have been many extensive and high-quality reviews of integrity test research, and
these reviews have concluded that those who score poorly will be poorer employees for any
number of different reasons. They may be more likely to lie or steal, be absent, or engage
in other counterproductive behaviors (Murphy & Davidshofer, 2005; Ones et al., 1993;
Sackett & Wanek, 1996). In the abstract, this sounds promising, but in the concrete,
there are some problems with integrity tests. Murphy and Davidshofer (2005) summarized
these concerns as follows:


1. It is difficult to know exactly what any given test of integrity measures. For example,
taking a long lunch hour may be considered “theft” (of time) on one test and
not even mentioned in another test. A study by Wanek, Sackett, and Ones (2003)
suggests that there are four basic components to “integrity tests” in general, but all
four do not appear in any one test. These components are antisocial behavior
(e.g., driving violations, theft admissions), socialization (e.g., emotional stability,
extraversion), positive outlook (e.g., safe behavior, acceptance of honesty norms),
and orderliness/diligence.

2. Unlike ability or even personality tests, applicants are seldom informed of their scores
or the results of an integrity test. This is particularly disturbing to a candidate
who has been rejected for a position and can’t find out why. Nor are applicants
typically warned of the risks and consequences of even taking the test in the
first place, raising an ethical issue of informed consent. However, any applicant who
refused to take the test would naturally be considered to have withdrawn his or her
application for employment, so it is not clear what the practical value of informing
applicants might be.

3. Often, integrity test scores are reported in a pass-fail or, more commonly, a
recommended-not recommended format. As we will see in Chapter 6, the setting
of pass-fail scores is very technical, and it is not clear that the test publishers take
these technical issues into account. That raises the possibility of false negatives:
the possibility that an individual would be erroneously rejected as a “risk.”
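
The third concern, that a pass-fail cut score can erroneously reject acceptable applicants, can be made concrete with a small tally. The sketch below is an illustration only: the applicant scores, the two cut scores, and the function name classify are invented for demonstration and do not come from any integrity test or study. Following the text's usage, a "false negative" here is an applicant who fails the test but would in fact have been a good employee.

```python
def classify(applicants, cut_score):
    """Tally selection outcomes for a pass-fail cut score.

    applicants: list of (test_score, would_be_good_employee) pairs.
    An applicant "passes" when test_score >= cut_score.
    """
    counts = {
        "true_positive": 0,   # passed and would be a good employee
        "false_positive": 0,  # passed but would not be a good employee
        "true_negative": 0,   # failed and would not be a good employee
        "false_negative": 0,  # failed but would have been a good employee
    }
    for score, good in applicants:
        passed = score >= cut_score
        if passed and good:
            counts["true_positive"] += 1
        elif passed and not good:
            counts["false_positive"] += 1
        elif not passed and good:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts


# Hypothetical applicants: (integrity test score, later judged a good employee?)
applicants = [
    (35, False), (42, True), (48, False), (51, True),
    (55, True), (58, False), (63, True), (70, True),
]

lenient = classify(applicants, cut_score=45)  # hypothetical lower cut score
strict = classify(applicants, cut_score=60)   # hypothetical higher cut score

print("Cut score 45:", lenient)
print("Cut score 60:", strict)
```

With these invented numbers, raising the cut score removes the false positives but triples the number of good applicants who are rejected, which is the trade-off that makes the choice of cut score a technical matter.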

Cascio (1998b) made an additional point about integrity as a concept. Many employers
and test publishers treat honesty as a trait, much like intelligence. But it is much easier
for a person to “go straight,” by behaving more honestly and morally, than it is for a
person with lower general mental ability to “go smart.” Yet organizations treat an honesty
or integrity score like a cognitive ability score: If a person gives honest answers to overt
questions about past indiscretions, he or she may be rejected even though he or she may
have reformed. Ironically, the only way for the reformed individual to pass the test might
be to lie!

You will recall that we discussed the concept of integrity in the section on the FFM of
personality earlier in this chapter. Some argue for a “narrow bandwidth” (e.g., separate
scores for separate dimensions such as conscientiousness or emotional stability), and
others argue for a wider bandwidth, which would involve developing a complex test to
measure a complex trait. Integrity is a perfect example of this debate. One might approach
the measurement of integrity by using a “broad bandwidth instrument” such as an
integrity test, or inferring integrity from the combination of scores on conscientiousness,
agreeableness, and emotional stability. Although this debate is largely theoretical, it also
has practical implications. If an employer wants to assess the integrity of an applicant,
what is the best way to do so? On the one hand, there is the ease of administering an
instrument to get right at integrity (the dedicated integrity test) rather than combining
scores from three different dimensions of a broader personality test, such as the NEO-PI.
On the other hand, much more is known about the meaning of any of the FFM
dimensions than the typical score on an integrity test. In addition, the information gathered
using a traditional FFM instrument can be used for predicting many behaviors beyond
honesty.

What, then, is the employer to do? A meta-analysis by Ones and Viswesvaran (2001)
compared personality tests with integrity tests for predicting various work outcomes and
behaviors. The results were compelling. Integrity tests did much better (r = +.41) than
FFM personality tests at predicting overall job performance (r = +.23), but FFM-based
tests did much better (r = +.51) than integrity tests (r = +.32) at predicting
counterproductive work behaviors (e.g., theft, violence). In a more recent meta-analysis, Ones,
Viswesvaran, and Schmidt (2003) found a strong relationship between personality-based
measures of integrity and absenteeism. To the extent that absenteeism can be construed
as a counterproductive work behavior (as we suggest in the next chapter), then FFM
personality tests might be more useful than individual integrity tests. On the other hand, if
overall work performance is most important for an organization, then an integrity test
might be a better choice. Optimally, the employer might cover all bases and use both types
of tests. One area of increasing relevance is the cross-cultural meaning of integrity and its
value as a predictor. Berry, Sackett, and Wiemann (2007) note that there is virtually no
such research, so integrity remains a variable of largely American origin and value. This
is a problem in an increasingly multinational workplace.
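
One way to get a feel for the size of the differences among the four validity coefficients reported above is to square each one, which gives the proportion of criterion variance associated with the predictor. The sketch below is only an illustrative calculation: the coefficients are the values attributed in the text to Ones and Viswesvaran (2001), while the dictionary layout and the function names variance_explained and stronger_predictor are hypothetical conveniences. Squaring r is one common reading of a correlation, not the only one used in selection research.

```python
# Validity coefficients as reported in the text (Ones & Viswesvaran, 2001).
validities = {
    "overall job performance": {
        "integrity test": 0.41,
        "FFM personality test": 0.23,
    },
    "counterproductive work behavior": {
        "integrity test": 0.32,
        "FFM personality test": 0.51,
    },
}


def variance_explained(r):
    """Proportion of criterion variance associated with a predictor (r squared)."""
    return r ** 2


def stronger_predictor(outcome):
    """Name of the test type with the larger validity coefficient for an outcome."""
    tests = validities[outcome]
    return max(tests, key=tests.get)


for outcome, tests in validities.items():
    for test, r in tests.items():
        print(f"{outcome:32s} {test:22s} r = {r:+.2f}  r^2 = {variance_explained(r):.3f}")
    print(f"  stronger predictor: {stronger_predictor(outcome)}")
```

The calculation simply restates the pattern in the text: which test type is the stronger predictor depends on which outcome the employer cares about most.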







Emotional Intelligence 


Emotional intelligence (EI) A proposed kind of intelligence focused on people’s
awareness of their own and others’ emotions.

Emotional intelligence quotient (EQ) Parallels the notion of intelligence quotient (IQ);
a score on a test of emotional intelligence.


As we saw earlier in the chapter, the concept of emotional intelligence (EI) has achieved
some notoriety with the public as well as arousing a great deal of interest among psychologists.
As there is no general agreement on the definition of EI, there can be no agreement
on how to measure it. Recall also that Davies and colleagues (1998) found little evidence
for the reliability or validity of existing EI tests. A score on a test of EI is often called
an emotional intelligence quotient, or EQ, to parallel the notion of IQ. As an example,
Multi-Health Systems, Inc. (MHS) is marketing an array of products related to EI and
EQ, including the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT™), a scale
for measuring organizational emotional intelligence, a 360-degree measure of emotional
intelligence, an emotional intelligence interview protocol, a youth version of the emotional
intelligence test to be used with children between the ages of 7 and 18, and a series of
books and videotapes intended to help people more fully develop their emotional
intelligence. Table 3.7 presents a sample item from a popular test of EI. Conte (2005) presents
an excellent review of the available measures as well as their strengths and weaknesses.
Unfortunately, he finds more weaknesses than strengths. The best that might be said at
this stage is that the advocates of EI have yet to make a convincing case for either the
construct or its measurement (Murphy, 2006). Additionally, EI researchers seem to be
ignoring a substantial body of historical research on social intelligence that is also discouraging
(Landy, 2006).

The scientific debate about the meaning or value of emotional intelligence has not slowed
the pace of research by applied psychologists. It appears that as the criterion of interest
becomes more narrow, the effectiveness of EI predictors increases. As an example, Rode
and colleagues (2007) examined public speaking and group effectiveness among
undergraduates and found that a combination of EI and a motivational measure predicted
speaking and group effectiveness. The EI measure was an ability rather than a personality
characteristic.

In 1966 Marvin Dunnette wrote “Fads, Fashions, and Folderol,” a sobering piece about
research, theory, and practice in I-O. Fads were defined as “practices and concepts
characterized by capriciousness and intense but short-lived interest” (p. 343). As data
accumulate, emotional intelligence may very well prove to be a useful addition to the testing
toolbox, but to avoid the graveyard of the “fads,” more concerted efforts of assessing
emotional intelligence will be required.


TABLE 3.7 A Sample Item from the MSCEIT™

SECTION H

1. Ken and Andy have been good friends for over 10 years. Recently, however, Andy was promoted and became Ken’s
manager. Ken felt that the new promotion had changed Andy in that Andy had become very bossy to him. How effective
would Ken be in maintaining a good relationship, if he chose to respond in each of the following ways?

Response 1: Ken tried to understand Andy’s new role and tried to adjust to the changes in their interactions.
a. Very ineffective b. Somewhat ineffective c. Neutral d. Somewhat effective e. Very effective

Response 2: Ken approached Andy and confronted him regarding the change in his behavior.
a. Very ineffective b. Somewhat ineffective c. Neutral d. Somewhat effective e. Very effective

Source: Adapted from MSCEIT™ Copyright © 1999, 2000, 2002 Multi-Health Systems, Inc. www.mhs.com. All rights reserved.
Reproduced with permission.


Individual Assessment

Individual assessment Situation in which only one candidate (or a very few) is
assessed on many different attributes.

By design, most paper and pencil tests are intended to be administered to large groups.
In individual assessment, however, only one candidate (or a very few) will be assessed
on many different attributes. To select a CEO for a Fortune 500 company, for example,
an executive recruiting firm may be retained to create a short list of three to five
candidates who will then undergo intensive assessment. This assessment often includes paper
and pencil tests, but they are administered and scored individually and may be used for
creating a profile of a candidate rather than comparing one candidate with another. Because
the target populations are usually upper-level executives in an organization, individual
assessment is sometimes referred to as executive or senior leader assessment (Howard, 2001).
Although frequently used for selection, individual assessment can also be used to identify
training needs or to provide career counseling or performance feedback to key
organizational members. Because it is time intensive and requires skilled assessors, it is expensive
and unlikely to be used for any other than key positions in the company.

Individual assessment is complex, involving a wide variety of content areas as well as a
wide variety of assessment processes. The tools most frequently used include various
interactive assessment tools rather than paper and pencil tests. A primary reason for this is that
the nature of the position is usually so complex that no paper and pencil test would, by
itself, provide sufficient information. Although more than one candidate may be
undergoing assessment, each candidate is usually assessed in isolation from the others. This allows
the organization to keep the identity of candidates a closely held secret for the protection
of the reputation of both the company (should a chosen candidate reject an offer) and
the candidate (should the organization ultimately reject a candidate).

The “typical” individual assessment is likely to include ability tests, personality tests, a
personal history statement, and interviews. It may also include simulation exercises or work
samples, and less frequently, a clinically based personality test such as the Rorschach Inkblot
Test or the Thematic Apperception Test (TAT). There is not much scientific support for
the use of these clinically oriented tests, but they are still occasionally used.

Although we will not cover individual assessment beyond this description, Silzer and
Jeanneret (1998) have provided rich detail on the typical process and content of
individual assessment for the interested reader, and Highhouse (2002) has presented a history
of individual assessment that incorporates a more critical evaluation of the role of
individual assessment in I-O psychology.


Interviews

In one form or another, an interview plays a role in virtually every selection or
promotion decision. This has been true for many decades; one of the first texts dealing with
employment interviewing was written by Bingham and Moore in 1931. Over the years, there have
been many fine texts (e.g., Webster, 1982) and reviews of the research on the interview
(Guion, 1998; Landy, 1989; McDaniel, Whetzel, Schmidt, & Maurer, 1994; Posthuma,
Morgeson, & Campion, 2002; Salgado et al., 2001).

Interview Content

Structured interview Consists of very specific questions asked of each candidate;
includes tightly crafted scoring schemes with detailed outlines for the interviewer with
respect to assigning ratings or scores based on interview performance.

Situational interview Asks the interviewee to describe in specific and behavioral
detail how he or she would respond to a hypothetical situation.

Unstructured interview Includes questions that may vary by candidate and that
allow the candidate to answer in any form he or she may prefer.

Interview content is often dictated by the amount of structure in the interview. A
structured interview consists of very specific questions asked of each candidate, often anchored
in asking the interviewee to describe in specific and behavioral detail how he or she would
respond to a hypothetical situation. This has been labeled the situational interview, a
subcategory of the structured interview. In addition, structured interviews typically have
tightly crafted scoring schemes with detailed outlines for the interviewer with respect to
assigning ratings or scores based on interview performance. The situational interview can
be contrasted with another form of structured interview known as the behavior
description interview. The basic difference between them is a time orientation. The situational
interview asks the applicant what he or she would do, whereas the behavior description
interview asks the applicant what he or she did do in the past. Recent research seems
to favor the behavior description format (Taylor & Small, 2002), particularly when the
interviews are being used to fill very high-level executive positions (Huffcutt, Weekley,
Wiesner, DeGroot, & Jones, 2001; Krajewski, Goffin, McCarthy, Rothstein, & Johnston,
2006). Huffcutt, Conway, Roth, and Klehe (2004) also found that as the complexity of
a job increases, the value of situational interviews decreases, although this is simply a
finding and not an explanation of why this might be the case. Huffcutt, Weekley, Wiesner,
DeGroot, and Jones speculate that the prevalence of the behavior description format at
higher levels may be because the behavior description interview allows for a greater
influence from verbal/presentation skills than the situational interview. Day and Carroll
(2003) suggest another possible explanation: The behavior description interview assesses
experience to a greater degree than abilities or personal characteristics. It is also likely that
as one moves up the organizational (and complexity) ladder, experience trumps ability or
personality.

An unstructured interview has much broader questions that may vary by candidate and
allow the candidate to answer in any form he or she may prefer. In addition, unstructured
interviews are more likely to have less detailed scoring formats, allowing greater
discretion by the interviewer for scoring. An example of structured interview questions is
presented in Table 3.8. The questions were developed to elicit behavioral skills from
candidates for 911 emergency dispatcher positions.

For the most part, interviews cover one or more of the following content areas: job
knowledge, abilities, skills, personality, and person-organization fit (Huffcutt, Conway, Roth,
& Stone, 2001). Huffcutt and colleagues found that the most frequently assessed constructs
in interviews were personality and applied social skills, followed by cognitive ability, job
knowledge, and skills. Salgado and Moscoso (2002) provided more detail on content. In
a meta-analysis of the employment interview, they found interesting content differences
between conventional interviews and tightly structured behavioral interviews. They
discovered that the less structured or conventional interview seems to be more closely
associated with personality and social/communication skills. On the other hand, the tightly
structured behavioral interview is more closely associated with job knowledge and
technical attributes, and, to a much lesser extent, personality characteristics. Similar results
have been reported by Huffcutt and colleagues (2001).

These results take on more meaning when considered in the context of reviews of the 
validity of the interview. It has been generally found (McDaniel et al., 1994) that the
highest validity coefficients are associated with structured and behavioral interviews (often in
the range of +.60) compared to the more personality-based interviews, which have
validity coefficients more often in the range of +.30. These results would seem to be a strong
recommendation for tightly structured interviews based on task-based job demands over
interviews intended to assess personality characteristics or personal style. But a note of
caution should be sounded here. Many of the studies on which these meta-analyses were 
based were conducted in an earlier time, before the emergence of team environments and 
client-centered work. As a result, many of the criteria used in the validation studies were 
task based. It is not surprising, then, that lower validity coefficients would be observed for 


TABLE 3.8 Examples of Structured Interview Questions and the Real-Life Incidents That Are the Foundation for
the Questions

These questions were used to interview applicants for emergency telephone operator positions.

INTERVIEW QUESTION: Imagine that you tried to help a stranger, for example, with traffic
directions or to get up after a fall, and that person blamed you for their misfortune or
yelled at you. How would you respond?

CRITICAL INCIDENT: 1. Telephone operator tries to verify address information for an
ambulance call. The caller yells at them for being stupid and slow. The operator quietly
assures the caller an ambulance is on the way and that she is merely reaffirming the address.

INTERVIEW QUESTION: Suppose a friend calls you and is extremely upset. Apparently, her child
has been injured. She begins to tell you, in a hysterical manner, all about her difficulty
in getting babysitters, what the child is wearing, what words the child can speak, and so on.
What would you do?

CRITICAL INCIDENT: 2. A caller is hysterical because her infant is dead. She yells
incoherently about the incident. The operator talks in a clear, calm voice and manages to
secure the woman's address, dispatches the call, and then tries to secure more information
about the child's status.

INTERVIEW QUESTION: How would you react if you were a salesclerk, waitress, or gas station
attendant and one of your customers talked back to you, indicated you should have known
something you did not, or told you that you were not waiting on them fast enough?

CRITICAL INCIDENT: 3. A clearly angry caller calls for the third time in an hour complaining
about the 911 service because no one has arrived to investigate a busted water pipe. The
operator tells the caller to go to ___ and hangs up.

Source: Schneider & Schmitt (1986).


interviews centered on personality characteristics. These “personality-based” interviews
were also done in a time when few sound personality tests were available. Schmidt and
Zimmerman (2004) present some intriguing findings that seem to demonstrate that when
three or four independent unstructured interviews are combined, the validity for that
unstructured combination is as high as the validity for a structured interview conducted by a
single individual. This is good news and bad news. The good news is that certain
administrative steps can be taken to increase the validity of unstructured interviews. The bad
news is that it might be necessary to conduct three or four independent interviews to
accomplish that increase, thus increasing the time and money the interview process requires.
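
The gain from combining several independent interviews can be illustrated with a standard psychometric formula for the validity of an equally weighted composite. The sketch below is not taken from Schmidt and Zimmerman (2004). The function name and the input values (a single-interview validity of .30 and a correlation of .50 between interviewers' ratings) are hypothetical and chosen only for illustration, and the formula assumes that every interviewer's ratings are equally valid and equally intercorrelated.

```python
import math

def composite_validity(single_validity, mean_intercorrelation, k):
    """Validity of an equally weighted average of k interview ratings.

    Simplifying assumption: each rating has the same validity and the
    same correlation with every other rating.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return (single_validity * math.sqrt(k)
            / math.sqrt(1 + (k - 1) * mean_intercorrelation))

# Hypothetical inputs, not values reported in the text.
for k in (1, 2, 3, 4):
    print(k, round(composite_validity(0.30, 0.50, k), 3))
```

Under these assumptions the composite validity rises with each added interviewer, but with diminishing returns, which is one way to see why the extra time and money may or may not be worth it.
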

In the context of the current state of the field, it might be reasonable to use
psychometric devices (e.g., the NEO-PI, the Hogan Personality Inventory, or the Saville
Consulting Wave) to assess personality attributes and the structured behavioral interview
to assess knowledge and skills. Guion (1998) concluded that the structured interview is a
valuable tool in the assessment toolbag. We agree.

Paradoxically, however, it appears as if managers may not agree. They tend to prefer
unstructured to structured interviews (van der Zee, Bakker, & Bakker, 2002). Lievens and
De Paepe (2004) have shed some light on this paradox. It appears that managers avoid
structure because they feel that it makes the process too impersonal; they want more
control over the interview questions and process. Some recent research (Chapman & Zweig,
2005) indicates that applicants agree, preferring less to more structure, seeing structured
interviews as more “difficult.” Lievens and De Paepe also found that those managers with
formal training in interviewing (e.g., through workshops) were much more likely to
impose more structure on the format.




Chapter 3 Individual Differences and Assessment 


Interview Process 

Independent of the actual content of the interview, there are many relevant process issues.
How should interviews be conducted? How should interviewers be trained? What are some
potential sources of bias in interviews? Table 3.9 presents information on many of these
practical issues. Studies (e.g., Huffcutt & Roth, 1998; Latham & Skarlicki, 1996; Sacco,
Scheu, Ryan, & Schmitt, 2003) appear to confirm, at least on a preliminary basis, that little
adverse impact is associated with the structured interview, particularly when compared with
more traditional paper and pencil tests of cognitive ability. Nevertheless, these studies
have examined traditional domestic demographic characteristics such as race and gender. As
applicant populations become increasingly multicultural, the issues of bias in the interview
may re-emerge due to the more dramatic cultural differences that may appear in applicant
responses. For example, many Asian cultures value modesty in self-presentation. Thus, Asian
applicants may be less comfortable than American applicants in extolling their virtues when
asked by an interviewer to describe strengths and weaknesses (a common question in
unstructured interviews).

Interviewees may have a completely different set of anxieties and priorities when it comes
to the interview. McCarthy and Goffin (2004) have identified five aspects of the interview
that may be associated with the “weak knees and sweaty palms” of the applicant. These
anxieties revolve around communication, appearance, social skills, interview performance,
and behavioral control (i.e., observable tension in the applicant). Anyone who has been an
applicant has experienced one or more of these “anxieties.” But after years of experience,
the first author of this text can assure you that exactly the same anxieties afflict the
untrained or inexperienced interviewer. Recently, Maurer and Solamon (2006) have presented a
treasure trove of recommendations for interview preparation and performance in the form of a
coaching program to help prepare candidates for Atlanta fire department and police department
promotions. Many of their recommendations apply to preparation for any interview.

Table 3.9 source: Based on Landy (1989); Huffcutt & Woehr (1999).

Assessment Centers


Even though the word “center” evokes an image of a physical place, assessment centers 
are collections of procedures for evaluation, no matter where these procedures are carried 
out. Assessment centers are very much like the individual assessment procedure we 
described earlier, except they are administered to groups of individuals rather than single 
individuals, and the assessments are typically done by multiple assessors rather than a 
single assessor. Assessment centers have a long and successful history, and there are many 
good books and articles describing variations on the technique (Bray, Campbell, & Grant, 
1974; Finkle, 1976; Guion, 1998; Spychalski, Quinones, Gauger, & Pohley, 1997; Thornton 
& Byham, 1982). A recent book illustrates the global value of the assessment center
methodology, showing applications in settings outside of the U.S. (Krause & Thornton, 2008). In
earlier years, there were as many variations of assessment centers as there were users. For 
this reason, a task force published Guidelines and Ethical Considerations for Assessment Center 
Operations (Task Force on Assessment Center Guidelines, 1989). These guidelines have 



Assessment center
Collection of procedures for
evaluation that is
administered to groups of
individuals; assessments
are typically performed by
multiple assessors.


TABLE 3.9 Potential Influences on Employment Interviews

Nature of the Information: negative versus positive
Placement of Information: early or late in the interview
Presence of Interviewer Stereotypes (e.g., Ideal Candidate)
Interviewer Knowledge of the Job in Question
Method Used by Interviewer to Combine Information
Nonverbal Behavior of Candidate: posture, gestures
Attitudinal or Racial Similarity of Candidate and Interviewer
Gender Similarity of Candidate and Interviewer
Quality of Competing Candidates
Interviewer Experience
Applicant Physical Appearance
Attention to Factual Detail by Interviewer
Extent to Which Interview Is Structured
Note Taking by Interviewer
Use of Same Interviewer(s) for All Candidates


3.4 Assessment Procedures 




done much to standardize the assessment center process and protect the rights of those
being assessed. 

Most assessment centers share the following characteristics (Finkle, 1976): 

1. Assessment is done in groups. A typical group size is 12, although smaller subgroups
may be formed for specific exercises. The group format provides opportunity for
peer evaluation.

2. Assessment is done by groups. Unlike the usual evaluators in individual assessment,
assessment center evaluators are usually managers chosen from the organization
but unfamiliar with the candidates.

3. Multiple methods of assessment are employed. Like individual assessment, these
might include paper and pencil tests, group exercises, interviews, and clinical
testing. A typical group exercise might be a leaderless group discussion that is observed
and rated by the assessors. An individual exercise might be an in-basket exercise in
which a candidate is presented with the contents of a typical in-basket and asked
to deal with each element in the basket by making a phone call, sending an e-mail,
writing a memo, or starting a file for information.

4. Assessment centers invariably have a “feel” of relevance to them, both for assessors
and for those being assessed. They are seen as much more “real” than interviews,
paper and pencil tests, or even isolated work simulations.

As in the individual assessment procedure, the results of the assessment center may
include a report, recommendation, and feedback to the participants. An excerpt from
a typical report appears in Table 3.10. On the basis of assessment center results, the
organization may make one or more of the following decisions (Finkle, 1976):

1. An assessee may or may not qualify for a given job or job level.

2. Assessees may be ranked on a series of attributes and placed into different
categories representing anticipated speed of promotion (e.g., fast track versus normal
progression groups).

3. Predictions of long-range potential may be made for one or more of the assessees.

4. Development and learning experiences for aiding the assessee in personal or
professional growth might be recommended.

TABLE 3.10 Portion of a Report Based on Assessment Center Evaluation

There were several indications from his behavior that his strong desire to make a favorable
impression promoted above-average tenseness in the assessment situation. On several
occasions, his behavior was characterized by [...] and controlled quietness, as though he
were reluctant to enter into a situation until he felt absolutely sure [...]

[...] he created was that of a young man eager to cooperate, comply, and do his best in order
to fulfill the expectations others had for him.

[...] respects, the trainee’s general abilities compare favorably with the total sample of
men in the Management Progress Study.

[...] members of the staff anticipated a very successful career in the Bell System for the
trainee. . . . There was a [...] point of disagreement concerning the speed with which he is
likely to reach the district level of management. [...] agreed that he presently displays
the abilities and potential to perform effectively at the district level.

Source: Bray et al. (1974).







There is general agreement that assessment centers can be valuable procedures for 
selection, promotion, and training needs analysis (Arthur, Day, McNelly, & Edens, 2003;
Bartram, 2002; Hunter & Hunter, 1984; Schmitt, Gooding, Noe, & Kirch, 1984; Hermelin,
Lievens, & Robertson, 2007). There is less agreement with respect to why they work (Klimoski
& Brickner, 1987; Lance, 2008; Lance et al., 2000; Landy, 1989; Sackett & Tuzinski, 2001).
Although the “why” question may be an interesting one for scientific and research
purposes (e.g., Bowler & Woehr, 2006), it is less important and more mind-numbing
from a practical perspective. Assessment centers include many different types of exercises 
and assess many different attributes. The information is eventually combined to yield a 
decision or recommendation that will be as good or as poor as the information that went 
into it. 

Decomposing the assessment center into its constituent elements and asking which 
part makes the greatest contribution is like decomposing a bouillabaisse and asking 
which ingredient made it taste so good. Nevertheless, I-O researchers cannot resist the temp¬ 
tation to decompose. And the temptation seems to be yielding informative results. The 
rationale of the assessment center is to provide opportunities for candidates to display 
effective performance in some tightly constructed simulated environments. But it is 
appearing more likely that it is not the performance that strikes the assessors, but under¬ 
lying abilities and personality characteristics illuminated by those simulated environ¬ 
ments. Overall assessment ratings appear to be closely associated with assessee cognitive 
ability (Collins et al., 2003; Hoeft & Schuler, 2001) and, to a substantial but lesser extent,
to assessee personality characteristics, particularly extraversion and emotional stability (Collins
et al., 2003; Hoeft & Schuler, 2001; Lievens, De Fruyt, & van Dam, 2001). These results
would seem to indicate that the combination of a good cognitive ability test and person¬ 
ality test might do as well as, if not better than (and at considerably lesser expense), a 
full-blown assessment center. Box 3.7 provides a familiar example of an assessment 
center. 

One intriguing finding from a German study (Schuler, Moser, & Funke, 1994) was that
assessment center results are much more predictive when the assessors have known the 
candidates for more than two years than when they have known them for less than two 
years. This suggests that assessors may be considering much more than the results of the 
assessment exercises in making evaluations, most likely past observations of the candidate’s 
performance. At the very least, this study brings into question the practice of choosing 
assessors who are unfamiliar with the candidates. 

Assessment centers can be expensive and time consuming. They are likely to be of great¬ 
est value to large organizations that favor internal movement and promotions and invest 
heavily in the learning and development of their members. In addition, candidates who 
are evaluated through assessment centers are often very enthusiastic about the process, 
and this enthusiasm likely translates into acceptance of feedback. This can be particularly 
important when the goal is employee development rather than employee selection. 
Nevertheless, many organizations can accomplish assessment more effectively with more 
traditional assessment procedures. 

On a less optimistic note, however, blacks and Hispanics appear to do less well in assess¬ 
ment centers than whites, although women tend to receive higher scores than men (Dean, 
Roth, & Bobko, 2008). This finding warrants more extensive replication since assessment
centers have often been adopted by employers to reduce adverse impact against ethnic
minority applicants. Researchers (e.g., Cascio & Aguinis, 2005) often express the expectation that
assessment centers are likely to have less or no adverse impact against ethnic minority appli¬ 
cants. It may be that the connection between assessment center ratings and cognitive abil¬ 
ity that we noted above is responsible for this effect, but we need to more seriously examine 
the possible adverse impact of assessment centers not only with respect to ethnic subgroups, 
but also with respect to age subgroups. 




BOX 3.7 THE APPRENTICE

Perhaps the most vivid popular version of the assessment center is the reality television
show The Apprentice, which we discussed from another angle in Chapter 1. This show pits a
number of candidates against one another to see who will become an “apprentice” for a year
with real estate tycoon Donald Trump. The candidates are assigned to teams, whose
composition changes as one candidate is fired each week. Within the candidate pool, the
situation is largely leaderless at the outset, but, as in more traditional assessment
centers, a leader eventually emerges from the pack. In The Apprentice, the implicit leader
of the assessor pack is easy to pick out by simply noting hair style (although efforts are
made to identify multiple assessors). The candidates are asked to work in teams with (and
often expected to work against) each other on practical tasks including design,
manufacturing, marketing, distribution, and sales. Then there is the “board room”
confrontation between and among assessors and assessees. This crucible often produces
memorable moments. For example, in season two, one candidate who offered to give up his
“immunity” from being fired is fired by “The Donald” for being “stupid, impulsive, and life
threatening” (Stanley, 2004). The feel of relevance of the situation, at least to
candidates and the viewing public, is apparent (hence, perhaps, the term “reality” show).
As the number of candidates dwindles, it is clear that some assessees do not “qualify”
(they can be identified as the ones getting into the taxi with luggage at the end of each
episode), that the eventual “apprentice” has been identified as having long-term potential,
and that many of the unsuccessful candidates view the experience as developmentally
valuable.

[Photo: Donald Trump presides over a group of would-be apprentices.]


Work Samples and Situational Tests 


Work Sample Tests 

As the name implies, work sample tests measure job skills by taking samples of behavior
under realistic joblike conditions. One of the earliest applications of this technique was in
the selection of trolley car operators in Boston in 1910. Trolley cars frequently came into
contact with people, horses, bicycles, and the newly introduced automobile. In 1913,
Munsterberg set up a “work station” to simulate the controls of the trolley car and
projected events onto a screen to see how potential operators would respond. Since this
study was carried out a decade before the correlation coefficient became a common index
of validity, Munsterberg’s assertion that his work station predicted operator success is
anecdotal.

In today’s work sample tests, the performance may or may not be assessed at an actual
workstation, but the task assigned and the equipment used to complete the task are designed


Work sample test
Assessment procedure that
measures job skills by
taking samples of behavior
under realistic joblike
conditions.








TABLE 3.11 Some Examples of Work Sample Tests

MOTOR WORK SAMPLES

Carving dexterity test for dental students
Blueprint reading test
Shorthand and stenography test
Rudder control test for pilots
Programming test for computer programmers
Map reading test for traffic control officers

VERBAL WORK SAMPLES

A test of common facts of law for law students
Group discussion test for supervisor
Judgment and decision-making test for administrators
Speech interview for foreign student
Test of basic information in chemistry
Test of ability to follow oral directions



to be realistic simulations of the actual job. Consider the example 
of an individual applying for a position as a call center represen¬ 
tative (i.e., answers customer service line calls). The applicant sits 
at a computer screen and responds to a customer call. The appli¬ 
cant has to enter account information, look up order information, 
and even deal with an angry customer who is disputing a charge. 
The applicant’s score is a combination of hard skills (e.g., amount 
of time on the call, efficiency of screen navigation) and soft skills 
(e.g., anger management). As another example, an applicant for 
the job of accounts payable clerk might be given a checkbook in 
which to make entries, a work report from which to generate an 
invoice, a petty cash ledger to balance, and a payroll task. The results 
would then be compared against some standard and a score 
assigned representing the level of test performance. Table 3.11 illus¬ 
trates some work sample elements. 

Like assessment centers, work samples have a “real” feeling to them and usually elicit good
reactions from candidates. Further, various studies have affirmed that work samples can be
valid assessment devices (Hunter & Hunter, 1984; Roth, Bobko, & McFarland, 2005). This is
not surprising because work samples usually [...] in question, and it is easier to document
their job relatedness. But like other formats, work samples are not intrinsically valid.
Their job relatedness depends heavily on the attributes being assessed by the format. Using
the example of the call center applicant, good performance may be the result of specific
knowledge (the candidate is familiar with the software), general knowledge (the candidate is
familiar with computer operations), or cognitive ability (the candidate is able to solve the
problem presented by the task through trial and error). When work sample tests make unique
contributions to test performance (e.g., above and beyond what might be predicted by a
simple test of cognitive ability), it is likely due to general or specific knowledge. As
Guion (1998) pointed out, the value of a work sample can be evaluated just as one would
evaluate any assessment device: job relatedness, perceived fairness, and cost
effectiveness. In Chapter 5, we will describe various techniques used to elicit knowledge
from nuclear power plant operators, such as the “walk-through” method. This might also be
considered an example of a work sample (Hedge, Teachout, & Laue, 1990).

[Photo: A 1920s example of work sample testing: an apparatus to test the skills of
prospective trolley drivers.]


Situational Judgment Tests

Recently, the notion of the work sample test has been expanded to cover white-collar
positions by creating what Motowidlo and colleagues (Motowidlo, Dunnette, & Carter, 1990;





FIGURE 3.10 An Example of a Situational Judgment Exercise

[...] planks 10 feet long near the point where he wishes to cross the stream. Under the
circumstances he should:

Walk to the bridge and cross it.
Break a hole in the ice near the edge of the stream to see how deep the stream is.
Cross with the aid of the planks, pushing one ahead of the other and walking on them.


Motowidlo & Tippins, 1993) have referred to as low-fidelity simulations and others have
referred to as situational judgment tests (SJT) (McDaniel, Morgeson, Finnegan,
Campion, & Braverman, 2001). A situational judgment test presents the candidate with a
written scenario and then asks the candidate to choose the best response from a series of
alternatives (see Figure 3.10). A recent text in the SIOP Frontiers Series does an excellent
job of reviewing the theory, research, and applications related to situational judgment tests
(Weekley & Ployhart, 2006).

McDaniel and colleagues (2001) have reviewed the research on situational judgment
tests and noted that in one form or another, such tests have been part of the assessment
practice of I-O psychologists since the 1920s. In a meta-analysis of 102 validity
coefficients, they concluded that there is substantial evidence of validity or job relatedness
for these types of tests. They found that the single strongest component of these tests was
general mental ability. Nevertheless, there appears to be more to SJTs than just “g.” In a
recent meta-analysis, McDaniel, Hartman, Whetzel, and Grubb (2007) demonstrated that
SJT scores have incremental validity above and beyond the prediction afforded by
personality tests and intelligence tests.
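
Incremental validity is commonly expressed as the gain in the squared multiple correlation (R²) when a new predictor is added to those already in use. The sketch below uses the standard two-predictor formula for R² to show the idea. It is a simplification: the function names and the three correlations are hypothetical and are not results from McDaniel and colleagues (2007), and a real analysis would include personality measures as well as a cognitive ability test.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

def incremental_r_squared(r_y1, r_y2, r_12):
    """Gain in R-squared from adding predictor 2 to predictor 1 alone."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1**2

# Hypothetical values: predictor 1 is a cognitive ability test,
# predictor 2 is an SJT, and the criterion is job performance.
gain = incremental_r_squared(r_y1=0.50, r_y2=0.30, r_12=0.40)
print(round(gain, 4))
```

The gain shrinks toward zero as the new predictor's overlap with the existing one accounts for more of its own validity, which is why an SJT that correlates substantially with "g" can still add prediction only if it also measures something else that is job related.
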

Clevenger, Pereira, Wiechmann, Schmitt, and Harvey (2001) evaluated the use of SJTs
in hiring decisions for a government agency and a private sector transportation company.
In addition to SJTs, they collected data on personality, cognitive ability, technical job
knowledge, and job experience of the candidates. They found that SJTs were able to improve
the prediction of performance even after the contributions of all of these other variables
had been controlled, and even though the SJT scores were substantially correlated with
the measure of cognitive ability. They suggested that SJTs are best used to measure


Situational judgment test 
Commonly a paper and 
pencil test that presents the 
candidate with a written 
scenario and asks the 
candidate to choose the 
best response from a series 
of alternatives.















procedural knowledge (what we referred to as tacit knowledge earlier in this chapter). In 
a more recent review, Chan and Schmitt (2005) have added adaptability (which we will
cover in detail in Chapter 4) to their hypothesis about what is measured by SJTs. As you
can see from Figure 3.11, it appears that various KSAOs produce competencies related to
tacit knowledge and adaptability (which could also be labeled practical intelligence) and
that these, in turn, produce positive and prevent negative job behavior. Weekley and Ployhart
(2005) suggest that SJTs assess general rather than job-specific forms of knowledge. The 
relationship between KSAOs and practical intelligence helps explain why there are posi¬ 
tive correlations between SJT scores, “g,” and personality test scores. It also helps explain 
why SJTs predict performance beyond any one or combination of those attributes—namely 
because the attributes support the development of tacit knowledge and adaptability but 
are different from any of those supporting KSAOs. This model is supported by the 
research of McDaniel and Nguyen (2001), which shows an increase in SJT scores with increas¬ 
ing years of experience. It is plausible that tacit knowledge and adaptability increase with 
experience. 

Another advantage of SJTs discovered in the study by Clevenger and colleagues was that 
the differences in scores between whites and both African Americans and Hispanics were 
considerably less than typically found in standard tests of cognitive ability. This may be a 
case of having your cake and eating it too. Not only did the SJT give a good assessment 
of general mental ability with lower adverse impact, it also measured something in addi¬ 
tion to “g” that was job related. This “something” was most likely practical intelligence as 
described above. In a follow-up study, Chan and Schmitt (2002) found once again that 
SJT scores contributed to the prediction of job performance for 160 civil service employ¬ 
ees, beyond what could be predicted from cognitive ability, personality, and job experi¬ 
ence. This study is particularly interesting because it was done in Singapore, suggesting 
that at least the format of the SJT can travel internationally. 

SJTs have also been adapted for video presentation by using video vignettes, rather than
a written description, to present the scenario. The results are encouraging. In two similar
studies, Weekley and Jones (1997) and Chan and Schmitt (1997) found that black-white
differences in SJT scores were smaller with a video than with a paper and pencil
presentation, and that the SJT produced more favorable attitudes toward the assessment process,
particularly among African American test takers. Further, Lievens and Sackett (2006) found
that the video version of an SJT had higher validities than a written version for predict¬ 
ing interpersonal behavior in medical students. 

The results of research on the SJT are very positive. SJTs seem to possess three
important characteristics for modern and practical assessment: They are job related, they are







accepted by test takers, and they have reduced adverse impact compared to other
traditional assessment devices. Further, the recent research on video presentations
suggests that further advances are likely to occur in this area, particularly in terms of
increasing the fidelity of the simulation from low to high and in further increasing the
acceptance of the format by test takers. If there is any caution in the wholesale adoption
of the SJT, it is that SJTs may be susceptible to faking (McDaniel et al., 2007; Nguyen,
Biderman, & McDaniel, 2005; Peeters & Lievens, 2005), particularly when the instructions
ask for “what would you tend to do?” versus “what is the best answer?” Much more needs
to be known about the extent to which the SJT (both its validity and its accuracy) is
susceptible to faking.


MODULE 3.4 SUMMARY


• A vigorous debate continues over whether there
is only one overarching cognitive ability (“g” or
“general mental ability”) or several distinct
facets or abilities. Psychologists have developed
tests that produce a single number intended
to represent cognitive ability, tests of specific
abilities, and test batteries designed to measure
several different facets of cognitive ability.

• Because most physically demanding jobs
require combinations of physical abilities, many
physical ability assessment procedures use
simulated pieces of work (e.g., carrying a load up a
ladder) rather than individual physical tests
(e.g., sit-ups or bench presses). There is substantial
evidence that measures of physical abilities can
improve the prediction of job success for many
physically demanding jobs.

• Personality testing in employment has shifted
from a screen-out process to a screen-in process
whereby employers seek to identify applicants
with positive personality characteristics (e.g.,
conscientiousness, emotional stability, or
agreeableness). There are many commercially
available instruments for measuring personality
characteristics, many based on the Big 5 model.

• Hogan, Hogan, and Roberts addressed practical
questions about using the measurement of
personality for making employment decisions.

• Practical considerations in personality testing
include the use of integrity tests, on which
“faking” is sometimes an issue, emotional intelligence
tests, and tests of interests and values.


• It is important for employers and applicants 
to distinguish between the content of testing 
(what attribute is being assessed) and the
process of testing (how it is being assessed). For 
example, the terms “personality” and “cognitive” 
describe the content of the assessment, and the 
terms “interview” and “background check” 
describe the process of the assessment. 

• Individual assessment is complex, involving 
a wide variety of content areas and assess¬ 
ment processes. The tools used most frequently 
include various interactive assessment tools 
rather than paper and pencil tests, as the nature 
of the position is usually so complex that no 
paper and pencil test would, by itself, provide 
sufficient information. 

• An interview plays a role in virtually every 
selection or promotion decision. Interviews vary 
in their structure and content. They can range 
on a continuum from very unstructured to very 
structured, and can cover one or more of the fol¬ 
lowing content areas: job knowledge, abilities, 
skills, personality, and person-organization fit. 

• Assessment centers have a long and successful his¬ 
tory. They are administered to groups of indi¬ 
viduals rather than single individuals, and the 
assessments are typically performed by multiple 
assessors. There is general agreement that an 
assessment center can be a valuable procedure for
selection, promotion, and training needs analysis. 

• Other common assessment devices include 
work samples and situational judgment tests. 







